qemu-e2k

Author	SHA1	Message	Date
Denis V. Lunev	bb16991fe0	parallels: fix formatting in bdrv_parallels initialization Old code is ugly and contains tabulations. There are no functional changes in this patch. Signed-off-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com>	2023-09-21 08:45:00 +02:00
Stefan Hajnoczi	0f86afde66	block-backend: process zoned requests in the current AioContext Process zoned requests in the current thread's AioContext instead of in the BlockBackend's AioContext. There is no need to use the BlockBackend's AioContext thanks to CoMutex bs->wps->colock, which protects zone metadata. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230912231037.826804-5-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Stefan Hajnoczi	46eb6e861b	block-backend: process I/O in the current AioContext Switch blk_aio_() APIs over to multi-queue by using qemu_get_current_aio_context() instead of blk_get_aio_context(). This change will allow devices to process I/O in multiple IOThreads in the future. I audited existing blk_aio_() callers: - migration/block.c: blk_mig_lock() protects the data accessed by the completion callback. - The remaining emulated devices and exports run with qemu_get_aio_context() == blk_get_aio_context(). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230912231037.826804-4-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Stefan Hajnoczi	652b0dd808	block: remove AIOCBInfo->get_aio_context() The synchronous bdrv_aio_cancel() function needs the acb's AioContext so it can call aio_poll() to wait for cancellation. It turns out that all users run under the BQL in the main AioContext, so this callback is not needed. Remove the callback, mark bdrv_aio_cancel() GLOBAL_STATE_CODE just like its blk_aio_cancel() caller, and poll the main loop AioContext. The purpose of this cleanup is to identify bdrv_aio_cancel() as an API that does not work with the multi-queue block layer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230912231037.826804-2-stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Klaus Jensen <k.jensen@samsung.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Andrey Drobyshev via	2848289168	block: add BDRV_BLOCK_COMPRESSED flag for bdrv_block_status() Functions qcow2_get_host_offset(), get_cluster_offset(), vmdk_co_block_status() explicitly report compressed cluster types when data is compressed. However, this information is never passed further. Let's make use of it by adding new BDRV_BLOCK_COMPRESSED flag for bdrv_block_status(), so that caller may know that the data range is compressed. In particular, we're going to use this flag to tweak "qemu-img map" output. This new flag is only being utilized by qcow, qcow2 and vmdk formats, as only those support compression. Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> Message-ID: <20230907210226.953821-2-andrey.drobyshev@virtuozzo.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	9def6082cf	block: Mark bdrv_add/del_child() and caller GRAPH_WRLOCK The functions read the parents list in the generic block layer, so we need to hold the graph lock already there. The BlockDriver implementations actually modify the graph, so it has to be a writer lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-22-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	32a8aba37e	block: Mark bdrv_unref_child() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_unref_child(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-21-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	ede01e4635	block: Mark bdrv_root_unref_child() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_root_unref_child(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-20-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	c629b6d223	block: Mark bdrv_child_perm() GRAPH_RDLOCK This adds GRAPH_RDLOCK annotations to declare that callers of bdrv_child_perm() need to hold a reader lock for the graph because some implementations access the children list of a node. The callers of bdrv_child_perm() conveniently already hold the lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-16-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	3804e3cf54	block: Mark bdrv_parent_perms_conflict() and callers GRAPH_RDLOCK The function reads the parents list, so it needs to hold the graph lock. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-14-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	afdaeb9ea0	block: Mark bdrv_attach_child() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_attach_child_common(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-13-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	7d4ca9d25b	block: Mark bdrv_attach_child_common() GRAPH_WRLOCK Instead of taking the writer lock internally, require callers to already hold it when calling bdrv_attach_child_common(). These callers will typically already hold the graph lock once the locking work is completed, which means that they can't call functions that take it internally. Note that the transaction callbacks still take the lock internally, so tran_finalize() must be called without the lock held. This is because bdrv_append() also calls bdrv_replace_node_noperm(), which currently requires the transaction callbacks to be called unlocked. In the next step, both of them can be switched to locked tran_finalize() calls together. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-11-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	ac2ae233a0	block: Introduce bdrv_schedule_unref() bdrv_unref() is called by a lot of places that need to hold the graph lock (it naturally happens in the context of operations that change the graph). However, bdrv_unref() takes the graph writer lock internally, so it can't actually be called while already holding a graph lock without causing a deadlock. bdrv_unref() also can't just become GRAPH_WRLOCK because it drains the node before closing it, and draining requires that the graph is unlocked. The solution is to defer deleting the node until we don't hold the lock any more and draining is possible again. Note that keeping images open for longer than necessary can create problems, too: You can't open an image again before it is really closed (if image locking didn't prevent it, it would cause corruption). Reopening an image immediately happens at least during bdrv_open() and bdrv_co_create(). In order to solve this problem, make sure to run the deferred unref in bdrv_graph_wrunlock(), i.e. the first possible place where we can drain again. This is also why bdrv_schedule_unref() is marked GRAPH_WRLOCK. The output of iotest 051 is updated because the additional polling changes the order of HMP output, resulting in a new "(qemu)" prompt in the test output that was previously on a separate line and filtered out. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230911094620.45040-6-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:01 +02:00
Kevin Wolf	edcce17b15	preallocate: Don't poll during permission updates When the permission related BlockDriver callbacks are called, we are in the middle of an operation traversing the block graph. Polling in such a place is a very bad idea because the graph could change in unexpected ways. In the future, callers will also hold the graph lock, which is likely to turn polling into a deadlock. So we need to get rid of calls to functions like bdrv_getlength() or bdrv_truncate() there as these functions poll internally. They are currently used so that when no parent has write/resize permissions on the image any more, the preallocate filter drops the extra preallocated area in the image file and gives up write/resize permissions itself. In order to achieve this without polling in .bdrv_check_perm, don't immediately truncate the image, but only schedule a BH to do so. The filter keeps the write/resize permissions a bit longer now until the BH has executed. There is one case in which delaying doesn't work: Reopening the image read-only. In this case, bs->file will likely be reopened read-only, too, so keeping write permissions a bit longer on it doesn't work. But we can already cover this case in preallocate_reopen_prepare() and not rely on the permission updates for it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:00 +02:00
Kevin Wolf	01e28f6006	preallocate: Factor out preallocate_truncate_to_real_size() It's essentially the same code in preallocate_check_perm() and preallocate_close(), except that the latter ignores errors. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230911094620.45040-3-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-20 17:46:00 +02:00
Stefan Hajnoczi	78f8b6d9c8	Block layer patches - Optimise reqs_lock to make multiqueue actually scale - virtio: Drop out of coroutine context in virtio_load() - iotests: Fix reference output for some tests after recent changes - vpc: Avoid dynamic stack allocation - Code cleanup, improved documentation -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmT7VYgRHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9YfOg/7BoYF6lkB7DF/jH3XLY6f8zoI+OVM7dg1 QFEjyVO+uZiJVh0CeBNI9WgnBe7f5vXMbiStyGbWKo3BLUsjnwoQcW/Sxpw61bR2 jZYK6UHe0RhFqTQpbt8G1iCmlpRS+sX+Cy+lxcVcbqxcnLRXCOjT6ivyA4bGbYIC q9BHg/9hBmjuM05NTV6Axy8qjqBGVaIWE9ALTnw8H//waBr4/ydJPTl7EWHe3+tO Stm73evgPG7aLHM6W4qdFW4gwAQ8f+f42Q+0NH1YavB/pN3LTN1B6sLQY/51du+0 d/JCsXex0IZQXmNPhqv1h01vhOyU9WBmlwpPG2iZv3a06SXk1ys3rQt/L7uIcsZg Z58CpcUJ517FERnkl0BWXzYhsdcW2K+RdlaiL5PX6H1A2B9LT05ouZfD47hh7kKv oX+Ulk05PFr3JRCKQF6QDEejRKXt169bGzInTlns/wXinD/V4sCkUnr9aWQuhoWk KhQm7WMscTTIyHP2FznO4x9kq0ALsoX/NKqBW2wgJUtqRzsd4XxPp5CXEsAir8Vt dpne/DaV5iDI1mGFJrvkctJN545tEoezBtUzC8/9rZGE0cxHAkhvQVZUDo7xVmrq PlGQ1ko9cNui/Gf9B6qDqaJJwSyw0S6vHurGVQJRwbyly57Fi5aisWkr4w7Rc4eA 7u9B1RvwF/Q= =2wGD -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging Block layer patches - Optimise reqs_lock to make multiqueue actually scale - virtio: Drop out of coroutine context in virtio_load() - iotests: Fix reference output for some tests after recent changes - vpc: Avoid dynamic stack allocation - Code cleanup, improved documentation # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmT7VYgRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9YfOg/7BoYF6lkB7DF/jH3XLY6f8zoI+OVM7dg1 # QFEjyVO+uZiJVh0CeBNI9WgnBe7f5vXMbiStyGbWKo3BLUsjnwoQcW/Sxpw61bR2 # jZYK6UHe0RhFqTQpbt8G1iCmlpRS+sX+Cy+lxcVcbqxcnLRXCOjT6ivyA4bGbYIC # q9BHg/9hBmjuM05NTV6Axy8qjqBGVaIWE9ALTnw8H//waBr4/ydJPTl7EWHe3+tO # Stm73evgPG7aLHM6W4qdFW4gwAQ8f+f42Q+0NH1YavB/pN3LTN1B6sLQY/51du+0 # d/JCsXex0IZQXmNPhqv1h01vhOyU9WBmlwpPG2iZv3a06SXk1ys3rQt/L7uIcsZg # Z58CpcUJ517FERnkl0BWXzYhsdcW2K+RdlaiL5PX6H1A2B9LT05ouZfD47hh7kKv # oX+Ulk05PFr3JRCKQF6QDEejRKXt169bGzInTlns/wXinD/V4sCkUnr9aWQuhoWk # KhQm7WMscTTIyHP2FznO4x9kq0ALsoX/NKqBW2wgJUtqRzsd4XxPp5CXEsAir8Vt # dpne/DaV5iDI1mGFJrvkctJN545tEoezBtUzC8/9rZGE0cxHAkhvQVZUDo7xVmrq # PlGQ1ko9cNui/Gf9B6qDqaJJwSyw0S6vHurGVQJRwbyly57Fi5aisWkr4w7Rc4eA # 7u9B1RvwF/Q= # =2wGD # -----END PGP SIGNATURE----- # gpg: Signature made Fri 08 Sep 2023 13:10:32 EDT # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: virtio: Drop out of coroutine context in virtio_load() vmstate: Mark VMStateInfo.get/put() coroutine_mixed_fn block: Make more BlockDriver definitions static block/meson.build: Restore alphabetical order of files block: Remove unnecessary variable in bdrv_block_device_info block: Remove bdrv_query_block_node_info vmdk: Clean up bdrv_open_child() return value check qemu-img: Update documentation for compressed images block: Be more verbose in create fallback block/iscsi: Document why we use raw malloc() qemu-img: omit errno value in error message block: change reqs_lock to QemuMutex block: minimize bs->reqs_lock section in tracked_request_end() iotests: adapt test output for new qemu_cleanup() behavior block/vpc: Avoid dynamic stack allocation Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-11 09:11:22 -04:00
Kevin Wolf	9ea473fb7b	block: Make more BlockDriver definitions static Most block driver implementations don't have any reason for their BlockDriver to be public. The only exceptions are bdrv_file, bdrv_raw and bdrv_qcow2, which are actually used in other source files. Make all other BlockDriver definitions static if they aren't yet. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230905130607.35134-3-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Kevin Wolf	9e03a5e195	block/meson.build: Restore alphabetical order of files When commit `5e5733e599` created block/meson.build, the list of unconditionally added files was in alphabetical order. Later commits added new files in random places. Reorder the list to be alphabetical again. (As for ordering foo.c against foo-*.c, there are both ways used currently; standardise on having foo.c first, even though this is different from the original commit 5e5733e5999.) Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230905130607.35134-2-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Fabiano Rosas	c3b29ae6b4	block: Remove unnecessary variable in bdrv_block_device_info The commit `5d8813593f` ("block/qapi: Let bdrv_query_image_info() recurse") removed the loop where we set the 'bs0' variable, so now it is just the same as 'bs'. Signed-off-by: Fabiano Rosas <farosas@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-ID: <20230901184605.32260-3-farosas@suse.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Fabiano Rosas	be2e51c503	block: Remove bdrv_query_block_node_info The last call site of this function has been removed by commit `c04d0ab026` ("qemu-img: Let info print block graph"). Reviewed-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Fabiano Rosas <farosas@suse.de> Message-ID: <20230901184605.32260-2-farosas@suse.de> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Dmitry Frolov	a8d99c0e6c	vmdk: Clean up bdrv_open_child() return value check bdrv_open_child() may return NULL. Usually return value is checked for this function. Check for return value is more reliable. Fixes: `24bc15d1f6` ("vmdk: Use BdrvChild instead of BDS for references to extents") Signed-off-by: Dmitry Frolov <frolov@swemel.ru> Message-ID: <20230831125926.796205-1-frolov@swemel.ru> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Peter Maydell	7966a36c83	block/iscsi: Document why we use raw malloc() In block/iscsi.c we use a raw malloc() call, which is unusual given the project standard is to use the glib memory allocation functions. Document why we do so, to avoid it being converted to g_malloc() by mistake. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-ID: <20230727150705.2664464-1-peter.maydell@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Stefan Hajnoczi	fa9185fcdf	block: change reqs_lock to QemuMutex CoMutex has poor performance when lock contention is high. The tracked requests list is accessed frequently and performance suffers in QEMU multi-queue block layer scenarios. It is not necessary to use CoMutex for the requests lock. The lock is always released across coroutine yield operations. It is held for relatively short periods of time and it is not beneficial to yield when the lock is held by another coroutine. Change the lock type from CoMutex to QemuMutex to improve multi-queue block layer performance. fio randread bs=4k iodepth=64 with 4 IOThreads handling a virtio-blk device with 8 virtqueues improves from 254k to 517k IOPS (+203%). Full benchmark results and configuration details are available here: `980c40845d` In the future we may wish to introduce thread-local tracked requests lists to avoid lock contention completely. That would be much more involved though. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230808155852.2745350-3-stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Stefan Hajnoczi	3480ce69a9	block: minimize bs->reqs_lock section in tracked_request_end() Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230808155852.2745350-2-stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Philippe Mathieu-Daudé	3c2c599c79	block/vpc: Avoid dynamic stack allocation Use autofree heap allocation instead of variable-length array on the stack. Here we don't expect the bitmap size to be enormous, and since we're about to read/write it to disk the overhead of the allocation should be fine. The codebase has very few VLAs, and if we can get rid of them all we can make the compiler error on new additions. This is a defensive measure against security bugs where an on-stack dynamic allocation isn't correctly size-checked (e.g. CVE-2021-3527). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> [PMM: expanded commit message] Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-ID: <20230811175229.808139-1-peter.maydell@linaro.org> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-09-08 17:03:09 +02:00
Stefan Hajnoczi	c5ea91da44	trivial patches for 2023-09-08 -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmT68tMPHG1qdEB0bHMu bXNrLnJ1AAoJEHAbT2saaT5ZbEwH/2XcX1f4KcEJbgUn0JVhGQ5GH2c2jepZlkTZ 2dhvdEECbOPMg73hty0fyyWlyuLWdJ9cMpONfMtzmHTH8RKEOAbpn/zusyo3H+48 6cunyUpBqbmb7MHPchrN+JmvtvaSPSazsj2Zdkh+Y4WlfEYj+yVysQ4zQlBlRyHv iOTi6OdjxXg1QcbtJxAUhp+tKaRJzagiCpLkoyW2m8DIuV9cLVHMJsE3OMgfKNgK /S+O1fLcaDhuSCrHAbZzArF3Tr4bfLqSwDtGCJfQpqKeIQDJuI+41GLIlm1nYY70 IFJzEWMOrX/rcMG1CQnUFZOOyDSO+NfILwNnU+eyM49MUekmY54= =mmPS -----END PGP SIGNATURE----- Merge tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu into staging trivial patches for 2023-09-08 # -----BEGIN PGP SIGNATURE----- # # iQFDBAABCAAtFiEEe3O61ovnosKJMUsicBtPaxppPlkFAmT68tMPHG1qdEB0bHMu # bXNrLnJ1AAoJEHAbT2saaT5ZbEwH/2XcX1f4KcEJbgUn0JVhGQ5GH2c2jepZlkTZ # 2dhvdEECbOPMg73hty0fyyWlyuLWdJ9cMpONfMtzmHTH8RKEOAbpn/zusyo3H+48 # 6cunyUpBqbmb7MHPchrN+JmvtvaSPSazsj2Zdkh+Y4WlfEYj+yVysQ4zQlBlRyHv # iOTi6OdjxXg1QcbtJxAUhp+tKaRJzagiCpLkoyW2m8DIuV9cLVHMJsE3OMgfKNgK # /S+O1fLcaDhuSCrHAbZzArF3Tr4bfLqSwDtGCJfQpqKeIQDJuI+41GLIlm1nYY70 # IFJzEWMOrX/rcMG1CQnUFZOOyDSO+NfILwNnU+eyM49MUekmY54= # =mmPS # -----END PGP SIGNATURE----- # gpg: Signature made Fri 08 Sep 2023 06:09:23 EDT # gpg: using RSA key 7B73BAD68BE7A2C289314B22701B4F6B1A693E59 # gpg: issuer "mjt@tls.msk.ru" # gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>" [full] # gpg: aka "Michael Tokarev <mjt@corpit.ru>" [full] # gpg: aka "Michael Tokarev <mjt@debian.org>" [full] # Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D 4324 457C E0A0 8044 65C5 # Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931 4B22 701B 4F6B 1A69 3E59 * tag 'pull-trivial-patches' of https://gitlab.com/mjt0k/qemu: (22 commits) qxl: don't assert() if device isn't yet initialized hw/net/vmxnet3: Fix guest-triggerable assert() tests/qtest/usb-hcd: Remove the empty "init" tests target/ppc: use g_free() in test_opcode_table() hw/ppc: use g_free() in spapr_tce_table_post_load() trivial: Simplify the spots that use TARGET_BIG_ENDIAN as a numeric value accel/tcg: Fix typo in translator_io_start() description tests/qtest/test-hmp: Fix migrate_set_parameter xbzrle-cache-size test docs tests: Fix use of migrate_set_parameter qemu-options.hx: Rephrase the descriptions of the -hd* and -cdrom options hw/display/xlnx_dp: update comments block: spelling fixes misc/other: spelling fixes qga/: spelling fixes tests/: spelling fixes scripts/: spelling fixes include/: spelling fixes audio: spelling fixes xen: spelling fix riscv: spelling fixes ... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-09-08 10:06:25 -04:00
Michael Tokarev	3202d8e404	block: spelling fixes Signed-off-by: Michael Tokarev <mjt@tls.msk.ru> Reviewed-by: Eric Blake <eblake@redhat.com>	2023-09-08 13:08:52 +03:00
Stefan Hajnoczi	06e0f098d6	io: follow coroutine AioContext in qio_channel_yield() The ongoing QEMU multi-queue block layer effort makes it possible for multiple threads to process I/O in parallel. The nbd block driver is not compatible with the multi-queue block layer yet because QIOChannel cannot be used easily from coroutines running in multiple threads. This series changes the QIOChannel API to make that possible. In the current API, calling qio_channel_attach_aio_context() sets the AioContext where qio_channel_yield() installs an fd handler prior to yielding: qio_channel_attach_aio_context(ioc, my_ctx); ... qio_channel_yield(ioc); // my_ctx is used here ... qio_channel_detach_aio_context(ioc); This API design has limitations: reading and writing must be done in the same AioContext and moving between AioContexts involves a cumbersome sequence of API calls that is not suitable for doing on a per-request basis. There is no fundamental reason why a QIOChannel needs to run within the same AioContext every time qio_channel_yield() is called. QIOChannel only uses the AioContext while inside qio_channel_yield(). The rest of the time, QIOChannel is independent of any AioContext. In the new API, qio_channel_yield() queries the AioContext from the current coroutine using qemu_coroutine_get_aio_context(). There is no need to explicitly attach/detach AioContexts anymore and qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone. One coroutine can read from the QIOChannel while another coroutine writes from a different AioContext. This API change allows the nbd block driver to use QIOChannel from any thread. It's important to keep in mind that the block driver already synchronizes QIOChannel access and ensures that two coroutines never read simultaneously or write simultaneously. This patch updates all users of qio_channel_attach_aio_context() to the new API. Most conversions are simple, but vhost-user-server requires a new qemu_coroutine_yield() call to quiesce the vu_client_trip() coroutine when not attached to any AioContext. While the API is has become simpler, there is one wart: QIOChannel has a special case for the iohandler AioContext (used for handlers that must not run in nested event loops). I didn't find an elegant way preserve that behavior, so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true\|false) for opting in to the new AioContext model. By default QIOChannel uses the iohandler AioHandler. Code that formerly called qio_channel_attach_aio_context() now calls qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is created. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230830224802.493686-5-stefanha@redhat.com> [eblake: also fix migration/rdma.c] Signed-off-by: Eric Blake <eblake@redhat.com>	2023-09-07 20:32:11 -05:00
Alexander Ivanov	c89d4362dc	parallels: Add data_off repairing to parallels_open() Place data_start/data_end calculation after reading the image header to s->header. Set s->data_start to the offset calculated in parallels_test_data_off(). Call bdrv_check() if data_off is incorrect. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	8cd19203f4	parallels: Add data_off check data_off field of the parallels image header can be corrupted. Check if this field greater than the header + BAT size and less than file size. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	7c5f86d5ff	parallels: Use bdrv_co_getlength() in parallels_check_outside_image() bdrv_co_getlength() should be used in coroutine context. Replace bdrv_getlength() by bdrv_co_getlength() in parallels_check_outside_image(). Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	cfce1091d5	parallels: Image repairing in parallels_open() Repair an image at opening if the image is unclean or out-of-image corruption was detected. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	6bb8bc6367	parallels: Add checking and repairing duplicate offsets in BAT Cluster offsets must be unique among all the BAT entries. Find duplicate offsets in the BAT and fix it by copying the content of the relevant cluster to a newly allocated cluster and set the new cluster offset to the duplicated entry. Add host_cluster_index() helper to deduplicate the code. When new clusters are allocated, the file size increases by 128 Mb. Call parallels_check_leak() to fix this leak. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	c0b154533e	parallels: Add data_start field to BDRVParallelsState In the next patch we will need the offset of the data area for host cluster index calculation. Add this field and setting up code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	728e10173b	parallels: Add "explicit" argument to parallels_check_leak() In the on of the next patches we need to repair leaks without changing leaks and leaks_fixed info in res. Also we don't want to print any warning about leaks. Add "explicit" argument to skip info changing if the argument is false. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	09eb64f9e3	parallels: Check if data_end greater than the file size Initially data_end is set to the data_off image header field and must not be greater than the file size. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	fcadb48662	parallels: Incorrect data end calculation in parallels_open() The BDRVParallelsState structure contains data_end field that is measured in sectors. In parallels_open() initially this field is set by data_off field from parallels image header. According to the parallels format documentation, data_off field contains an offset, in sectors, from the start of the file to the start of the data area. For "WithoutFreeSpace" images: if data_off is zero, the offset is calculated as the end of the BAT table plus some padding to ensure sector size alignment. The parallels_open() function has code for handling zero value in data_off, but in the result data_end contains the offset in bytes. Replace the alignment to sector size by division by sector size and fix the comparision with s->header_size. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Alexander Ivanov	a338dcbbab	parallels: Fix comments formatting inside parallels driver This patch is technically necessary as git patch rendering could result in moving some code from one place to the another and that hits checkpatch.pl warning. This problem specifically happens within next series. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Denis V. Lunev <den@openvz.org>	2023-09-06 17:36:49 +02:00
Andrey Drobyshev	fc6b211f92	block/io: align requests to subcluster_size When target image is using subclusters, and we align the request during copy-on-read, it makes sense to align to subcluster_size rather than cluster_size. Otherwise we end up with unnecessary allocations. This commit renames bdrv_round_to_clusters() to bdrv_round_to_subclusters() and utilizes subcluster_size field of BlockDriverInfo to make necessary alignments. It affects copy-on-read as well as mirror job (which is using bdrv_round_to_clusters()). This change also fixes the following bug with failing assert (covered by the test in the subsequent commit): qemu-img create -f qcow2 base.qcow2 64K qemu-img create -f qcow2 -o extended_l2=on,backing_file=base.qcow2,backing_fmt=qcow2 img.qcow2 64K qemu-io -c "write -P 0xaa 0 2K" img.qcow2 qemu-io -C -c "read -P 0x00 2K 62K" img.qcow2 qemu-io: ../block/io.c:1236: bdrv_co_do_copy_on_readv: Assertion `skip_bytes < pnum' failed. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230711172553.234055-3-andrey.drobyshev@virtuozzo.com>	2023-08-30 07:39:10 -04:00
Andrey Drobyshev	c54483b6f4	block: add subcluster_size field to BlockDriverInfo This is going to be used in the subsequent commit as requests alignment (in particular, during copy-on-read). This value only makes sense for the formats which support subclusters (currently QCOW2 only). If this field isn't set by driver's own bdrv_get_info() implementation, we simply set it equal to the cluster size thus treating each cluster as having a single subcluster. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Andrey Drobyshev <andrey.drobyshev@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-ID: <20230711172553.234055-2-andrey.drobyshev@virtuozzo.com>	2023-08-30 07:39:10 -04:00
Hanna Czenczek	d31b50a15d	file-posix: Simplify raw_co_prw's 'out' zone code We duplicate the same condition three times here, pull it out to the top level. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230824155345.109765-5-hreitz@redhat.com> Reviewed-by: Sam Li <faithilikerun@gmail.com>	2023-08-29 10:51:04 +02:00
Hanna Czenczek	deab5c9a4e	file-posix: Fix zone update in I/O error path We must check that zone information is present before running update_zones_wp(). Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2234374 Fixes: Coverity CID 1512459 Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230824155345.109765-4-hreitz@redhat.com> Reviewed-by: Sam Li <faithilikerun@gmail.com>	2023-08-29 10:50:49 +02:00
Hanna Czenczek	4b5d80f3d0	file-posix: Check bs->bl.zoned for zone info Instead of checking bs->wps or bs->bl.zone_size for whether zone information is present, check bs->bl.zoned. That is the flag that raw_refresh_zoned_limits() reliably sets to indicate zone support. If it is set to something other than BLK_Z_NONE, other values and objects like bs->wps and bs->bl.zone_size must be non-null/zero and valid; if it is not, we cannot rely on their validity. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230824155345.109765-3-hreitz@redhat.com> Reviewed-by: Sam Li <faithilikerun@gmail.com>	2023-08-29 10:50:38 +02:00
Hanna Czenczek	56d1a022a7	file-posix: Clear bs->bl.zoned on error bs->bl.zoned is what indicates whether the zone information is present and valid; it is the only thing that raw_refresh_zoned_limits() sets if CONFIG_BLKZONED is not defined, and it is also the only thing that it sets if CONFIG_BLKZONED is defined, but there are no zones. Make sure that it is always set to BLK_Z_NONE if there is an error anywhere in raw_refresh_zoned_limits() so that we do not accidentally announce zones while our information is incomplete or invalid. This also fixes a memory leak in the last error path in raw_refresh_zoned_limits(). Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230824155345.109765-2-hreitz@redhat.com> Reviewed-by: Sam Li <faithilikerun@gmail.com>	2023-08-29 10:49:58 +02:00
zhenwei pi	3b2337eff0	block/throttle-groups: Use ThrottleDirection instread of bool is_write 'bool is_write' style is obsolete from throttle framework, adapt block throttle groups to the new style: - use ThrottleDirection instead of 'bool is_write'. Ex, schedule_next_request(ThrottleGroupMember tgm, bool is_write) -> schedule_next_request(ThrottleGroupMember tgm, ThrottleDirection direction) - use THROTTLE_MAX instead of hard code. Ex, ThrottleGroupMember tokens[2] -> ThrottleGroupMember tokens[THROTTLE_MAX] - use ThrottleDirection instead of hard code on iteration. Ex, (i = 0; i < 2; i++) -> for (dir = THROTTLE_READ; dir < THROTTLE_MAX; dir++) Use a simple python script to test the new style: #!/usr/bin/python3 import subprocess import random import time commands = ['virsh blkdeviotune jammy vda --write-bytes-sec ', \ 'virsh blkdeviotune jammy vda --write-iops-sec ', \ 'virsh blkdeviotune jammy vda --read-bytes-sec ', \ 'virsh blkdeviotune jammy vda --read-iops-sec '] for loop in range(1, 1000): time.sleep(random.randrange(3, 5)) command = commands[random.randrange(0, 3)] + str(random.randrange(0, 1000000)) subprocess.run(command, shell=True, check=True) This works fine. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20230728022006.1098509-10-pizhenwei@bytedance.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-08-29 10:49:24 +02:00
zhenwei pi	e76f201f69	throttle: use enum ThrottleDirection instead of bool is_write enum ThrottleDirection is already there, use ThrottleDirection instead of 'bool is_write' for throttle API, also modify related codes from block, fsdev, cryptodev and tests. Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Message-Id: <20230728022006.1098509-7-pizhenwei@bytedance.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-08-29 10:49:24 +02:00
Stefano Garzarella	9b06d0d076	block/blkio: add more comments on the fd passing handling As Hanna pointed out, it is not clear in the code why qemu_open() can fail, and why blkio_set_int("fd") is not enough to discover the `fd` property support. Let's fix them by adding more details in the code comments. Suggested-by: Hanna Czenczek <hreitz@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230803082825.25293-3-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-08-03 11:28:43 -04:00
Stefano Garzarella	0b054b4c82	block/blkio: close the fd when blkio_connect() fails libblkio drivers take ownership of `fd` only after a successful blkio_connect(), so if it fails, we are still the owners. Fixes: `cad2ccc395` ("block/blkio: use qemu_open() to support fd passing for virtio-blk") Suggested-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-id: 20230803082825.25293-2-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-08-03 11:28:43 -04:00
Richard Henderson	f33c745764	Pull request Please include these bug fixes in QEMU 8.1. Thanks! -----BEGIN PGP SIGNATURE----- iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTCzPUACgkQnKSrs4Gr c8g1DAf/fPUQ4zRsCn079pHIyK9TFo4COm23p4kiusxj8otfjt8LH1Zsc9pGWC2+ bl2RlnPID8JlyJFDRN7b/RCEhj45a83GtCmhDDmqVgy1eO5vwOKm2XyyWeD+pq/U Hf2QLPLZZ7tCD8Njpty+gB3Ux4zqthKGXSg8FpJ3w0tl4me2efLvjMa6jHMwtnHT aAbyQ3WMpT9w4XHLqRQDHzBqrTSY4od3nl9SrM/DQ2klLIcz8ECTEZVBY9B3pq6m QvAg24tfb0QvS14YnZv/PMCfOaVuE87M9G4f93pCynnMxMYze+XczL0sGhIAS9wp 03NgGlhGumOix6r2kHjlG6p3xywV8A== =jMf8 -----END PGP SIGNATURE----- Merge tag 'block-pull-request' of https://gitlab.com/stefanha/qemu into staging Pull request Please include these bug fixes in QEMU 8.1. Thanks! # -----BEGIN PGP SIGNATURE----- # # iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmTCzPUACgkQnKSrs4Gr # c8g1DAf/fPUQ4zRsCn079pHIyK9TFo4COm23p4kiusxj8otfjt8LH1Zsc9pGWC2+ # bl2RlnPID8JlyJFDRN7b/RCEhj45a83GtCmhDDmqVgy1eO5vwOKm2XyyWeD+pq/U # Hf2QLPLZZ7tCD8Njpty+gB3Ux4zqthKGXSg8FpJ3w0tl4me2efLvjMa6jHMwtnHT # aAbyQ3WMpT9w4XHLqRQDHzBqrTSY4od3nl9SrM/DQ2klLIcz8ECTEZVBY9B3pq6m # QvAg24tfb0QvS14YnZv/PMCfOaVuE87M9G4f93pCynnMxMYze+XczL0sGhIAS9wp # 03NgGlhGumOix6r2kHjlG6p3xywV8A== # =jMf8 # -----END PGP SIGNATURE----- # gpg: Signature made Thu 27 Jul 2023 01:00:53 PM PDT # gpg: using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full] # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" [full] * tag 'block-pull-request' of https://gitlab.com/stefanha/qemu: block/blkio: use blkio_set_int("fd") to check fd support block/blkio: fall back on using `path` when `fd` setting fails block/blkio: retry blkio_connect() if it fails using `fd` block/blkio: move blkio_connect() in the drivers functions block: Fix pad_request's request restriction block/file-posix: fix g_file_get_contents return path block/blkio: do not use open flags in qemu_open() block/blkio: enable the completion eventfd Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-07-27 17:41:55 -07:00
Stefano Garzarella	1c38fe69e2	block/blkio: use blkio_set_int("fd") to check fd support Setting the `fd` property fails with virtio-blk-* libblkio drivers that do not support fd passing since https://gitlab.com/libblkio/libblkio/-/merge_requests/208. Getting the `fd` property, on the other hand, always succeeds for virtio-blk-* libblkio drivers even when they don't support fd passing. This patch switches to setting the `fd` property because it is a better mechanism for probing fd passing support than getting the `fd` property. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230727161020.84213-5-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-27 15:51:46 -04:00
Stefano Garzarella	723bea27b1	block/blkio: fall back on using `path` when `fd` setting fails qemu_open() fails if called with an unix domain socket in this way: -blockdev node-name=drive0,driver=virtio-blk-vhost-user,path=vhost-user-blk.sock,cache.direct=on: Could not open 'vhost-user-blk.sock': No such device or address Since virtio-blk-vhost-user does not support fd passing, let`s always fall back on using `path` if we fail the fd passing. Fixes: `cad2ccc395` ("block/blkio: use qemu_open() to support fd passing for virtio-blk") Reported-by: Qing Wang <qinwang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230727161020.84213-4-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-27 15:51:46 -04:00
Stefano Garzarella	809c319f8a	block/blkio: retry blkio_connect() if it fails using `fd` libblkio 1.3.0 added support of "fd" property for virtio-blk-vhost-vdpa driver. In QEMU, starting from commit `cad2ccc395` ("block/blkio: use qemu_open() to support fd passing for virtio-blk") we are using `blkio_get_int(..., "fd")` to check if the "fd" property is supported for all the virtio-blk-* driver. Unfortunately that property is also available for those driver that do not support it, such as virtio-blk-vhost-user. So, `blkio_get_int()` is not enough to check whether the driver supports the `fd` property or not. This is because the virito-blk common libblkio driver only checks whether or not `fd` is set during `blkio_connect()` and fails with -EINVAL for those transports that do not support it (all except vhost-vdpa for now). So let's handle the `blkio_connect()` failure, retrying it using `path` directly. Fixes: `cad2ccc395` ("block/blkio: use qemu_open() to support fd passing for virtio-blk") Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230727161020.84213-3-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-27 15:51:46 -04:00
Stefano Garzarella	69785d66ae	block/blkio: move blkio_connect() in the drivers functions This is in preparation for the next patch, where for virtio-blk drivers we need to handle the failure of blkio_connect(). Let's also rename the _open() functions to _connect() to make the code reflect the changes applied. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230727161020.84213-2-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-27 15:51:46 -04:00
Hanna Czenczek	ef256751e9	block: Fix pad_request's request restriction bdrv_pad_request() relies on requests' lengths not to exceed SIZE_MAX, which bdrv_check_qiov_request() does not guarantee. bdrv_check_request32() however will guarantee this, and both of bdrv_pad_request()'s callers (bdrv_co_preadv_part() and bdrv_co_pwritev_part()) already run it before calling bdrv_pad_request(). Therefore, bdrv_pad_request() can safely call bdrv_check_request32() without expecting error, too. In effect, this patch will not change guest-visible behavior. It is a clean-up to tighten a condition to match what is guaranteed by our callers, and which exists purely to show clearly why the subsequent assertion (`assert(*bytes <= SIZE_MAX)`) is always true. Note there is a difference between the interfaces of bdrv_check_qiov_request() and bdrv_check_request32(): The former takes an errp, the latter does not, so we can no longer just pass &error_abort. Instead, we need to check the returned value. While we do expect success (because the callers have already run this function), an assert(ret == 0) is not much simpler than just to return an error if it occurs, so let us handle errors by returning them up the stack now. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-id: 20230714085938.202730-1-hreitz@redhat.com Fixes: `18743311b8` ("block: Collapse padded I/O vecs exceeding IOV_MAX") Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-27 09:48:16 -04:00
Sam Li	29a242e165	block/file-posix: fix g_file_get_contents return path The g_file_get_contents() function returns a g_boolean. If it fails, the returned value will be 0 instead of -1. Solve the issue by skipping assigning ret value. This issue was found by Matthew Rosato using virtio-blk-{pci,ccw} backed by an NVMe partition e.g. /dev/nvme0n1p1 on s390x. Signed-off-by: Sam Li <faithilikerun@gmail.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: 20230727115844.8480-1-faithilikerun@gmail.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-27 09:46:09 -04:00
Stefano Garzarella	a5942c177b	block/blkio: do not use open flags in qemu_open() qemu_open() in blkio_virtio_blk_common_open() is used to open the character device (e.g. /dev/vhost-vdpa-0 or /dev/vfio/vfio) or in the future eventually the unix socket. In all these cases we cannot open the path in read-only mode, when the `read-only` option of blockdev is on, because the exchange of IOCTL commands for example will fail. In order to open the device read-only, we have to use the `read-only` property of the libblkio driver as we already do in blkio_file_open(). Fixes: `cad2ccc395` ("block/blkio: use qemu_open() to support fd passing for virtio-blk") Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2225439 Reported-by: Qing Wang <qinwang@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: 20230726074807.14041-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-26 11:29:39 -04:00
Stefano Garzarella	9359c45988	block/blkio: enable the completion eventfd Until libblkio 1.3.0, virtio-blk drivers had completion eventfd notifications enabled from the start, but from the next releases this is no longer the case, so we have to explicitly enable them. In fact, the libblkio documentation says they could be disabled, so we should always enable them at the start if we want to be sure to get completion eventfd notifications: By default, the driver might not generate completion events for requests so it is necessary to explicitly enable the completion file descriptor before use: void blkioq_set_completion_fd_enabled(struct blkioq *q, bool enable); I discovered this while trying a development version of libblkio: the guest kernel hangs during boot, while probing the device. Fixes: `fd66dbd424` ("blkio: add libblkio block driver") Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230725103744.77343-1-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-25 15:34:17 -04:00
Eric Blake	8cb98a725e	nbd/client: Simplify cookie vs. index computation Our code relies on a sentinel cookie value of zero for deciding when a packet has been handled, as well as relying on array indices between 0 and MAX_NBD_REQUESTS-1 for dereferencing purposes. As long as we can symmetrically convert between two forms, there is no reason to go with the odd choice of using XOR with a random pointer, when we can instead simplify the mappings with a mere offset of 1. Using ((uint64_t)-1) as the sentinel instead of NULL such that the two macros could be entirely eliminated might also be possible, but would require a more careful audit to find places where we currently rely on zero-initialization to be interpreted as the sentinel value, so I did not pursue that course. Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230608135653.2918540-7-eblake@redhat.com> [eblake: enhance commit message] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>	2023-07-19 15:26:13 -05:00
Eric Blake	22efd81104	nbd: s/handle/cookie/ to match NBD spec Externally, libnbd exposed the 64-bit opaque marker for each client NBD packet as the "cookie", because it was less confusing when contrasted with 'struct nbd_handle *' holding all libnbd state. It also avoids confusion between the noun 'handle' as a way to identify a packet and the verb 'handle' for reacting to things like signals. Upstream NBD changed their spec to favor the name "cookie" based on libnbd's recommendations[1], so we can do likewise. [1] https://github.com/NetworkBlockDevice/nbd/commit/ca4392eb2b Signed-off-by: Eric Blake <eblake@redhat.com> Message-ID: <20230608135653.2918540-6-eblake@redhat.com> [eblake: typo fix] Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>	2023-07-19 15:25:30 -05:00
Stefan Hajnoczi	66547f416a	block/nvme: invoke blk_io_plug_call() outside q->lock blk_io_plug_call() is invoked outside a blk_io_plug()/blk_io_unplug() section while opening the NVMe drive from: nvme_file_open() -> nvme_init() -> nvme_identify() -> nvme_admin_cmd_sync() -> nvme_submit_command() -> blk_io_plug_call() blk_io_plug_call() immediately invokes the given callback when the current thread is not plugged, as is the case during nvme_file_open(). Unfortunately, nvme_submit_command() calls blk_io_plug_call() with q->lock still held: ... q->sq.tail = (q->sq.tail + 1) % NVME_QUEUE_SIZE; q->need_kick++; blk_io_plug_call(nvme_unplug_fn, q); qemu_mutex_unlock(&q->lock); ^^^^^^^^^^^^^^^^^^^^^^^^^^^ nvme_unplug_fn() deadlocks trying to acquire q->lock because the lock is already acquired by the same thread. The symptom is that QEMU hangs during startup while opening the NVMe drive. Fix this by moving the blk_io_plug_call() outside q->lock. This is safe because no other thread runs code related to this queue and blk_io_plug_call()'s internal state is immune to thread safety issues since it is thread-local. Reported-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Lukas Doktor <ldoktor@redhat.com> Message-id: 20230712191628.252806-1-stefanha@redhat.com Fixes: `f2e590002b` ("block/nvme: convert to blk_io_plug_call() API") Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-17 09:17:41 -04:00
Stefan Hajnoczi	c21eae1ccc	block/blkio: fix module_block.py parsing When QEMU is built with --enable-modules, the module_block.py script parses block/*.c to find block drivers that are built as modules. The script generates a table of block drivers called block_driver_modules[]. This table is used for block driver module loading. The blkio.c driver uses macros to define its BlockDriver structs. This was done to avoid code duplication but the module_block.py script is unable to parse the macro. The result is that libblkio-based block drivers can be built as modules but will not be found at runtime. One fix is to make the module_block.py script or build system fancier so it can parse C macros (e.g. by parsing the preprocessed source code). I chose not to do this because it raises the complexity of the build, making future issues harder to debug. Keep things simple: use the macro to avoid duplicating BlockDriver function pointers but define .format_name and .protocol_name manually for each BlockDriver. This way the module_block.py is able to parse the code. Also get rid of the block driver name macros (e.g. DRIVER_IO_URING) because module_block.py cannot parse them either. Fixes: `fd66dbd424` ("blkio: add libblkio block driver") Reported-by: Qing Wang <qinwang@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230704123436.187761-1-stefanha@redhat.com Cc: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-07-04 17:28:25 +02:00
Richard Henderson	0eb8f90ede	Block layer patches - Re-enable the graph lock - More fixes to coroutine_fn marking -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmScQCQRHGt3b2xmQHJl ZGhhdC5jb20ACgkQfwmycsiPL9bNSA//WIzPT45rFhl2U9QgyOJu26ho6ahsgwgI Z3QM5kCDB1dAN9USRPxhGboLGo8CyY7eeSwSrR7RtwBGYrWrAoJfGp5gK/7d9s5Q o0AGgRPnJGhFkBhRRMytsDsewM6Kk4IRmk4HMK3cOH3rsSM8RHs6KmDSBKesllu0 QVGf3qW4u8LHyZyGM5OlPVUbtuDuK6/52FGhpXBp+x4oyNegOhjwO4mGOvTG+xIk Q5zwWZaPfjxaEDkvW8iahB6/D7Tpt64BmMf1Ydhxcd5eKEp932CiBI36aAlNKoRD Al5wztRx1GEh12ekN39jIi7Ypp3JX26keJcieKU0q656pT551UFRYjU0Rk08/Cca qv2oiQDu6bHgQ9zCQ1nMfa9+K2MyBwx0b5qfYkvs2RzgCTl8ImgBQANHfw8tz6Bq HUo1zsFBXCaK0boUB5iFwdf3rlx3t9UTEuDej/RaHqZjZD5xeG/smCcOlSfHaKUa wXfYxvm8ZfefJn1D6io1A+7M956uvIQNtmh13cU44clgFX9Y/bBNMg/5lMRsJKo8 xxjvqCAyxo/pPfUsVWx4pc8AXbfVa85gyoSiaLEYZnqP54sJ2lFccqykCsTy58Lo VDcoPnoSc+LNqBOvtzxXgQbEWFCXU6fe0+TZgVYUvExWFIAOImeDWg2GD1JVrwsX e9QrPhL3DXg= =ZQcP -----END PGP SIGNATURE----- Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging Block layer patches - Re-enable the graph lock - More fixes to coroutine_fn marking # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmScQCQRHGt3b2xmQHJl # ZGhhdC5jb20ACgkQfwmycsiPL9bNSA//WIzPT45rFhl2U9QgyOJu26ho6ahsgwgI # Z3QM5kCDB1dAN9USRPxhGboLGo8CyY7eeSwSrR7RtwBGYrWrAoJfGp5gK/7d9s5Q # o0AGgRPnJGhFkBhRRMytsDsewM6Kk4IRmk4HMK3cOH3rsSM8RHs6KmDSBKesllu0 # QVGf3qW4u8LHyZyGM5OlPVUbtuDuK6/52FGhpXBp+x4oyNegOhjwO4mGOvTG+xIk # Q5zwWZaPfjxaEDkvW8iahB6/D7Tpt64BmMf1Ydhxcd5eKEp932CiBI36aAlNKoRD # Al5wztRx1GEh12ekN39jIi7Ypp3JX26keJcieKU0q656pT551UFRYjU0Rk08/Cca # qv2oiQDu6bHgQ9zCQ1nMfa9+K2MyBwx0b5qfYkvs2RzgCTl8ImgBQANHfw8tz6Bq # HUo1zsFBXCaK0boUB5iFwdf3rlx3t9UTEuDej/RaHqZjZD5xeG/smCcOlSfHaKUa # wXfYxvm8ZfefJn1D6io1A+7M956uvIQNtmh13cU44clgFX9Y/bBNMg/5lMRsJKo8 # xxjvqCAyxo/pPfUsVWx4pc8AXbfVa85gyoSiaLEYZnqP54sJ2lFccqykCsTy58Lo # VDcoPnoSc+LNqBOvtzxXgQbEWFCXU6fe0+TZgVYUvExWFIAOImeDWg2GD1JVrwsX # e9QrPhL3DXg= # =ZQcP # -----END PGP SIGNATURE----- # gpg: Signature made Wed 28 Jun 2023 04:13:56 PM CEST # gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6 # gpg: issuer "kwolf@redhat.com" # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full] * tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (23 commits) block: use bdrv_co_debug_event in coroutine context block: use bdrv_co_getlength in coroutine context qcow2: mark more functions as coroutine_fns and GRAPH_RDLOCK vhdx: mark more functions as coroutine_fns and GRAPH_RDLOCK vmdk: mark more functions as coroutine_fns and GRAPH_RDLOCK dmg: mark more functions as coroutine_fns and GRAPH_RDLOCK cloop: mark more functions as coroutine_fns and GRAPH_RDLOCK block: mark another function as coroutine_fns and GRAPH_UNLOCKED bochs: mark more functions as coroutine_fns and GRAPH_RDLOCK vpc: mark more functions as coroutine_fns and GRAPH_RDLOCK qed: mark more functions as coroutine_fns and GRAPH_RDLOCK file-posix: remove incorrect coroutine_fn calls Revert "graph-lock: Disable locking for now" graph-lock: Unlock the AioContext while polling blockjob: Fix AioContext locking in block_job_add_bdrv() block: Fix AioContext locking in bdrv_open_backing_file() block: Fix AioContext locking in bdrv_open_inherit() block: Fix AioContext locking in bdrv_reopen_parse_file_or_backing() block: Fix AioContext locking in bdrv_attach_child_common() block: Fix AioContext locking in bdrv_open_child() ... Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-28 17:29:53 +02:00
Paolo Bonzini	17362398ee	block: use bdrv_co_debug_event in coroutine context bdrv_co_debug_event was recently introduced, with bdrv_debug_event becoming a wrapper for use in unknown context. Because most of the time bdrv_debug_event is used on a BdrvChild via the wrapper macro BLKDBG_EVENT, introduce a similar macro BLKDBG_CO_EVENT that calls bdrv_co_debug_event, and switch whenever possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-13-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:34 +02:00
Paolo Bonzini	0af02bd107	block: use bdrv_co_getlength in coroutine context bdrv_co_getlength was recently introduced, with bdrv_getlength becoming a wrapper for use in unknown context. Switch to bdrv_co_getlength when possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-12-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:33 +02:00
Paolo Bonzini	70bacc4453	qcow2: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-11-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:32 +02:00
Paolo Bonzini	f6b0899493	vhdx: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-10-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:30 +02:00
Paolo Bonzini	28944f99c4	vmdk: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-9-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:29 +02:00
Paolo Bonzini	688dc49da5	dmg: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-8-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:28 +02:00
Paolo Bonzini	cf8d4c582b	cloop: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-7-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:27 +02:00
Paolo Bonzini	e7918e9619	bochs: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-5-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:24 +02:00
Paolo Bonzini	517b5dfffd	vpc: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-4-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:23 +02:00
Paolo Bonzini	bba667da7f	qed: mark more functions as coroutine_fns and GRAPH_RDLOCK Mark functions as coroutine_fn when they are only called by other coroutine_fns and they can suspend. Change calls to co_wrappers to use the non-wrapped functions, which in turn requires adding GRAPH_RDLOCK annotations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-3-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:20 +02:00
Paolo Bonzini	36c6c8773a	file-posix: remove incorrect coroutine_fn calls raw_co_getlength is called by handle_aiocb_write_zeroes, which is not a coroutine function. This is harmless because raw_co_getlength does not actually suspend, but in the interest of clarity make it a non-coroutine_fn that is just wrapped by the coroutine_fn raw_co_getlength. Likewise, check_cache_dropped was only a coroutine_fn because it called raw_co_getlength, so it can be made non-coroutine as well. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-ID: <20230601115145.196465-2-pbonzini@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 09:46:14 +02:00
Kevin Wolf	3cce22defb	Revert "graph-lock: Disable locking for now" Now that bdrv_graph_wrlock() temporarily drops the AioContext lock that its caller holds, it can poll without causing deadlocks. We can now re-enable graph locking. This reverts commit ad128dff0bf4b6f971d05eb4335a627883a19c1d. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230605085711.21261-12-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 08:46:26 +02:00
Kevin Wolf	31b2ddfea3	graph-lock: Unlock the AioContext while polling If the caller keeps the AioContext lock for a block node in an iothread, polling in bdrv_graph_wrlock() deadlocks if the condition isn't fulfilled immediately. Now that all callers make sure to actually have the AioContext locked when they call bdrv_replace_child_noperm() like they should, we can change bdrv_graph_wrlock() to take a BlockDriverState whose AioContext lock the caller holds (NULL if it doesn't) and unlock it temporarily while polling. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-ID: <20230605085711.21261-11-kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-06-28 08:46:23 +02:00
Manos Pitsidianakis	f8ed3648b5	vhost-user: fully use new backend/frontend naming Slave/master nomenclature was replaced with backend/frontend in commit `1fc19b6527` ("vhost-user: Adopt new backend naming") This patch replaces all remaining uses of master and slave in the codebase. Signed-off-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org> Message-Id: <20230613080849.2115347-1-manos.pitsidianakis@linaro.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>	2023-06-26 09:50:00 -04:00
Philippe Mathieu-Daudé	de6cd7599b	meson: Replace softmmu_ss -> system_ss We use the user_ss[] array to hold the user emulation sources, and the softmmu_ss[] array to hold the system emulation ones. Hold the latter in the 'system_ss[]' array for parity with user emulation. Mechanical change doing: $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss) Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20230613133347.82210-10-philmd@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-20 10:01:30 +02:00
Richard Henderson	7ce5a15fa6	* Fix emulated LCCB, LOCFHR, MXDB and MXDBR s390x instructions * Fix the malta machine on s390x (big endian) hosts * Emulate /proc/cpuinfo on s390x * Remove pointless QOM casts * Improve the inclusion logic for libkeyutils and ipmi-bt-test in meson.build -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmR+ycgRHHRodXRoQHJl ZGhhdC5jb20ACgkQLtnXdP5wLbWXXw//WPz3ng50KLS+M1t3/ULEjO6XkGfP2LQZ RsZq3hf9THFPZgcREk+6SQvttOSTuvHNfakfujS6U1Ou5thReWqLe4itFW6+hB5j kQ+Sm6YJ+fpezkBnSefcUoL5nA9VVKZ6KE6kxq5CUBZNoIk1sSsfrU8y8wjzW0yg 2nraOcG10aLpO2BfvKHVEAhJtwl9pHJsFANmHC2/h2wC9BZIAzdxiytzdcJ909gN AAa0hIrLK/oFgJjkSSxu+QTaVGPARXqkx5WV546F/zmDMFUWd9nrXaegwqxjgPBN m9Ua0SXll5hX2Z57vjJWlbTYkD+JUB22L0N7p5/xzhYRpLVSq1pdveo9psrzIC3E Bt7chZB58acQepJHxxa3UHDOHcnfdfaN+Dd9wD29wHr7nK8lOcsen7/7V+5YXomc qenkCtkpjKTl07OBxe6MDGZtPZYA8fK1CjEyYwHCe8QvxEzsyg96Bm3j4N2VPxQU +f/sFPX7SgogZI4mB4wdoxOF1RmQ+DXQ2tnB970txZRkmFq2jJHpW86jkkbq2Jl1 KIjgdIXjVgy+MPtuQzO5cT+jfhGQL7FQynGXHjv/UidBid5XD3TDVNa9AthN3Mng +rPT90VJ7j9soMqvmNT1COSIRD+M49dQKBIQuq/gWplaTOHaAcJrCwYScwqe0u0P zmjCNeuPVw8= =dfJr -----END PGP SIGNATURE----- Merge tag 'pull-request-2023-06-06' of https://gitlab.com/thuth/qemu into staging * Fix emulated LCCB, LOCFHR, MXDB and MXDBR s390x instructions * Fix the malta machine on s390x (big endian) hosts * Emulate /proc/cpuinfo on s390x * Remove pointless QOM casts * Improve the inclusion logic for libkeyutils and ipmi-bt-test in meson.build # -----BEGIN PGP SIGNATURE----- # # iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmR+ycgRHHRodXRoQHJl # ZGhhdC5jb20ACgkQLtnXdP5wLbWXXw//WPz3ng50KLS+M1t3/ULEjO6XkGfP2LQZ # RsZq3hf9THFPZgcREk+6SQvttOSTuvHNfakfujS6U1Ou5thReWqLe4itFW6+hB5j # kQ+Sm6YJ+fpezkBnSefcUoL5nA9VVKZ6KE6kxq5CUBZNoIk1sSsfrU8y8wjzW0yg # 2nraOcG10aLpO2BfvKHVEAhJtwl9pHJsFANmHC2/h2wC9BZIAzdxiytzdcJ909gN # AAa0hIrLK/oFgJjkSSxu+QTaVGPARXqkx5WV546F/zmDMFUWd9nrXaegwqxjgPBN # m9Ua0SXll5hX2Z57vjJWlbTYkD+JUB22L0N7p5/xzhYRpLVSq1pdveo9psrzIC3E # Bt7chZB58acQepJHxxa3UHDOHcnfdfaN+Dd9wD29wHr7nK8lOcsen7/7V+5YXomc # qenkCtkpjKTl07OBxe6MDGZtPZYA8fK1CjEyYwHCe8QvxEzsyg96Bm3j4N2VPxQU # +f/sFPX7SgogZI4mB4wdoxOF1RmQ+DXQ2tnB970txZRkmFq2jJHpW86jkkbq2Jl1 # KIjgdIXjVgy+MPtuQzO5cT+jfhGQL7FQynGXHjv/UidBid5XD3TDVNa9AthN3Mng # +rPT90VJ7j9soMqvmNT1COSIRD+M49dQKBIQuq/gWplaTOHaAcJrCwYScwqe0u0P # zmjCNeuPVw8= # =dfJr # -----END PGP SIGNATURE----- # gpg: Signature made Mon 05 Jun 2023 10:53:12 PM PDT # gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5 # gpg: issuer "thuth@redhat.com" # gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [unknown] # gpg: aka "Thomas Huth <thuth@redhat.com>" [unknown] # gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown] # gpg: aka "Thomas Huth <huth@tuxfamily.org>" [unknown] # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5 * tag 'pull-request-2023-06-06' of https://gitlab.com/thuth/qemu: linux-user: Emulate /proc/cpuinfo on s390x linux-user/elfload: Introduce elf_hwcap_str() on s390x linux-user/elfload: Expose get_elf_hwcap() on s390x s390x/tcg: Fix CPU address returned by STIDP bulk: Remove pointless QOM casts scripts: Add qom-cast-macro-clean-cocci-gen.py hw/mips/malta: Fix the malta machine on big endian hosts gitlab-ci: Remove unused Python package tests/qtest: Run ipmi-bt-test only if CONFIG_IPMI_EXTERN is set tests/tcg/s390x: Test MXDB and MXDBR target/s390x: Fix MXDB and MXDBR Add conditional dependency for libkeyutils tests/tcg/s390x: Test single-stepping SVC linux-user/s390x: Fix single-stepping SVC tests/tcg/s390x: Test LOCFHR target/s390x: Fix LOCFHR taking the wrong half of R2 tests/tcg/s390x: Test LCBB target/s390x: Fix LCBB overwriting the top 32 bits Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2023-06-06 07:07:37 -07:00
Philippe Mathieu-Daudé	7d5b0d6864	bulk: Remove pointless QOM casts Mechanical change running Coccinelle spatch with content generated from the qom-cast-macro-clean-cocci-gen.py added in the previous commit. Suggested-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org> Message-Id: <20230601093452.38972-3-philmd@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Thomas Huth <thuth@redhat.com>	2023-06-05 20:48:34 +02:00
Jean-Louis Dupond	42a2890a76	qcow2: add discard-no-unref option When we for example have a sparse qcow2 image and discard: unmap is enabled, there can be a lot of fragmentation in the image after some time. Especially on VM's that do a lot of writes/deletes. This causes the qcow2 image to grow even over 110% of its virtual size, because the free gaps in the image get too small to allocate new continuous clusters. So it allocates new space at the end of the image. Disabling discard is not an option, as discard is needed to keep the incremental backup size as low as possible. Without discard, the incremental backups would become large, as qemu thinks it's just dirty blocks but it doesn't know the blocks are unneeded. So we need to avoid fragmentation but also 'empty' the unneeded blocks in the image to have a small incremental backup. In addition, we also want to send the discards further down the stack, so the underlying blocks are still discarded. Therefor we introduce a new qcow2 option "discard-no-unref". When setting this option to true, discards will no longer have the qcow2 driver relinquish cluster allocations. Other than that, the request is handled as normal: All clusters in range are marked as zero, and, if pass-discard-request is true, it is passed further down the stack. The only difference is that the now-zero clusters are preallocated instead of being unallocated. This will avoid fragmentation on the qcow2 image. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond <jean-louis@dupond.be> Message-Id: <20230605084523.34134-2-jean-louis@dupond.be> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:15:42 +02:00
Alexander Ivanov	029136f261	parallels: Incorrect condition in out-of-image check All the offsets in the BAT must be lower than the file size. Fix the check condition for correct check. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-13-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:59 +02:00
Alexander Ivanov	c0fc051dd4	parallels: Replace qemu_co_mutex_lock by WITH_QEMU_LOCK_GUARD Replace the way we use mutex in parallels_co_check() for simplier and less error prone code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-12-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:58 +02:00
Alexander Ivanov	7e259e2540	parallels: Move statistic collection to a separate function We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-11-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:57 +02:00
Alexander Ivanov	09a21edfaf	parallels: Move check of leaks to a separate function We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-10-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:56 +02:00
Alexander Ivanov	9616f7a6c2	parallels: Fix statistics calculation Exclude out-of-image clusters from allocated and fragmented clusters calculation. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-9-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:55 +02:00
Alexander Ivanov	6d416e56a7	parallels: Move check of cluster outside image to a separate function We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-8-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:54 +02:00
Alexander Ivanov	96de69c7d9	parallels: Move check of unclean image to a separate function We will add more and more checks so we need a better code structure in parallels_co_check. Let each check performs in a separate loop in a separate helper. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-7-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:53 +02:00
Alexander Ivanov	3569cb7b78	parallels: Use generic infrastructure for BAT writing in parallels_co_check() BAT is written in the context of conventional operations over the image inside bdrv_co_flush() when it calls parallels_co_flush_to_os() callback. Thus we should not modify BAT array directly, but call parallels_set_bat_entry() helper and bdrv_co_flush() further on. After that there is no need to manually write BAT and track its modification. This makes code more generic and allows to split parallels_set_bat_entry() for independent pieces. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-6-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:52 +02:00
Alexander Ivanov	b64b29b96b	parallels: create parallels_set_bat_entry_helper() to assign BAT value This helper will be reused in next patches during parallels_co_check rework to simplify its code. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-5-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:51 +02:00
Alexander Ivanov	679749ce41	parallels: Fix image_end_offset and data_end after out-of-image check Set data_end to the end of the last cluster inside the image. In such a way we can be sure that corrupted offsets in the BAT can't affect on the image size. If there are no allocated clusters set image_end_offset by data_end. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Message-Id: <20230424093147.197643-4-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:50 +02:00
Alexander Ivanov	ab2d739c41	parallels: Fix high_off calculation in parallels_co_check() Don't let high_off be more than the file size even if we don't fix the image. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru> Message-Id: <20230424093147.197643-3-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:49 +02:00
Alexander Ivanov	f5e715dbbb	parallels: Out of image offset in BAT leads to image inflation data_end field in BDRVParallelsState is set to the biggest offset present in BAT. If this offset is outside of the image, any further write will create the cluster at this offset and/or the image will be truncated to this offset on close. This is definitely not correct. Raise an error in parallels_open() if data_end points outside the image and it is not a check (let the check to repaire the image). Set data_end to the end of the cluster with the last correct offset. Signed-off-by: Alexander Ivanov <alexander.ivanov@virtuozzo.com> Message-Id: <20230424093147.197643-2-alexander.ivanov@virtuozzo.com> Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>	2023-06-05 13:13:47 +02:00
Hanna Czenczek	18743311b8	block: Collapse padded I/O vecs exceeding IOV_MAX When processing vectored guest requests that are not aligned to the storage request alignment, we pad them by adding head and/or tail buffers for a read-modify-write cycle. The guest can submit I/O vectors up to IOV_MAX (1024) in length, but with this padding, the vector can exceed that limit. As of `4c002cef0e` ("util/iov: make qemu_iovec_init_extended() honest"), we refuse to pad vectors beyond the limit, instead returning an error to the guest. To the guest, this appears as a random I/O error. We should not return an I/O error to the guest when it issued a perfectly valid request. Before `4c002cef0e`, we just made the vector longer than IOV_MAX, which generally seems to work (because the guest assumes a smaller alignment than we really have, file-posix's raw_co_prw() will generally see bdrv_qiov_is_aligned() return false, and so emulate the request, so that the IOV_MAX does not matter). However, that does not seem exactly great. I see two ways to fix this problem: 1. We split such long requests into two requests. 2. We join some elements of the vector into new buffers to make it shorter. I am wary of (1), because it seems like it may have unintended side effects. (2) on the other hand seems relatively simple to implement, with hopefully few side effects, so this patch does that. To do this, the use of qemu_iovec_init_extended() in bdrv_pad_request() is effectively replaced by the new function bdrv_create_padded_qiov(), which not only wraps the request IOV with padding head/tail, but also ensures that the resulting vector will not have more than IOV_MAX elements. Putting that functionality into qemu_iovec_init_extended() is infeasible because it requires allocating a bounce buffer; doing so would require many more parameters (buffer alignment, how to initialize the buffer, and out parameters like the buffer, its length, and the original elements), which is not reasonable. Conversely, it is not difficult to move qemu_iovec_init_extended()'s functionality into bdrv_create_padded_qiov() by using public qemu_iovec_* functions, so that is what this patch does. Because bdrv_pad_request() was the only "serious" user of qemu_iovec_init_extended(), the next patch will remove the latter function, so the functionality is not implemented twice. Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2141964 Signed-off-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230411173418.19549-3-hreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>	2023-06-05 13:11:06 +02:00
Eric Blake	bd1386cce1	cutils: Adjust signature of parse_uint[_full] It's already confusing that we have two very similar functions for wrapping the parse of a 64-bit unsigned value, differing mainly on whether they permit leading '-'. Adjust the signature of parse_uint() and parse_uint_full() to be like all of qemu_strto(): put the result parameter last, use the same types (uint64_t and unsigned long long have the same width, but are not always the same type), and mark endptr const (this latter change only affects the rare caller of parse_uint). Adjust all callers in the tree. While at it, note that since cutils.c already includes: QEMU_BUILD_BUG_ON(sizeof(int64_t) != sizeof(long long)); we are guaranteed that the result of parse_uint cannot exceed UINT64_MAX (or the build would have failed), so we can drop pre-existing dead comparisons in opts-visitor.c that were never false. Reviewed-by: Hanna Czenczek <hreitz@redhat.com> Message-Id: <20230522190441.64278-8-eblake@redhat.com> [eblake: Drop dead code spotted by Markus] Signed-off-by: Eric Blake <eblake@redhat.com>	2023-06-02 12:27:19 -05:00
Stefano Garzarella	cad2ccc395	block/blkio: use qemu_open() to support fd passing for virtio-blk Some virtio-blk drivers (e.g. virtio-blk-vhost-vdpa) supports the fd passing. Let's expose this to the user, so the management layer can pass the file descriptor of an already opened path. If the libblkio virtio-blk driver supports fd passing, let's always use qemu_open() to open the `path`, so we can handle fd passing from the management layer through the "/dev/fdset/N" special path. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230530071941.8954-2-sgarzare@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-06-01 11:08:21 -04:00
Stefan Hajnoczi	2a0d7cb6b7	block: remove bdrv_co_io_plug() API No block driver implements .bdrv_co_io_plug() anymore. Get rid of the function pointers. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-7-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-06-01 08:59:24 -04:00
Stefan Hajnoczi	076682885d	block/linux-aio: convert to blk_io_plug_call() API Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Note that a dev_max_batch check is dropped in laio_io_unplug() because the semantics of unplug_fn() are different from .bdrv_co_unplug(): 1. unplug_fn() is only called when the last blk_io_unplug() call occurs, not every time blk_io_unplug() is called. 2. unplug_fn() is per-thread, not per-BlockDriverState, so there is no way to get per-BlockDriverState fields like dev_max_batch. Therefore this condition cannot be moved to laio_unplug_fn(). It is not obvious that this condition affects performance in practice, so I am removing it instead of trying to come up with a more complex mechanism to preserve the condition. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20230530180959.1108766-6-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-06-01 07:34:03 -04:00
Stefan Hajnoczi	6a6da231b7	block/io_uring: convert to blk_io_plug_call() API Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-5-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-06-01 07:34:03 -04:00
Stefan Hajnoczi	28ff7b4dfb	block/blkio: convert to blk_io_plug_call() API Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-4-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-06-01 07:34:03 -04:00
Stefan Hajnoczi	f2e590002b	block/nvme: convert to blk_io_plug_call() API Stop using the .bdrv_co_io_plug() API because it is not multi-queue block layer friendly. Use the new blk_io_plug_call() API to batch I/O submission instead. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Kevin Wolf <kwolf@redhat.com> Message-id: 20230530180959.1108766-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2023-06-01 07:34:03 -04:00

1 2 3 4 5 ...

6044 Commits