qemu-e2k

Author	SHA1	Message	Date
Emanuele Giuseppe Esposito	1581a70ddd	block/coroutines: I/O and "I/O or GS" API block coroutines functions run in different aiocontext, and are not protected by the BQL. Therefore are I/O. On the other side, generated_co_wrapper functions use BDRV_POLL_WHILE, meaning the caller can either be the main loop or a specific iothread. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-25-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	c5be7445b7	assertions for blockdev.h global state API Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-22-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	bdb734763b	block.c: add assertions to static functions Following the assertion derived from the API split, propagate the assertion also in the static functions. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-18-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	967d7905d1	IO_CODE and IO_OR_GS_CODE for block_int I/O API Mark all I/O functions with IO_CODE, and all "I/O OR GS" with IO_OR_GS_CODE. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-14-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	b4ad82aab1	assertions for block_int global state API Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-13-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	37868b2ac6	IO_CODE and IO_OR_GS_CODE for block-backend I/O API Mark all I/O functions with IO_CODE, and all "I/O OR GS" with IO_OR_GS_CODE. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-10-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	0439c5a462	block/block-backend.c: assertions for block-backend All the global state (GS) API functions will check that qemu_in_main_thread() returns true. If not, it means that the safety of BQL cannot be guaranteed, and they need to be moved to I/O. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-9-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	a2c4c3b19b	include/sysemu/block-backend: split header into I/O and global state (GS) API Similarly to the previous patches, split block-backend.h in block-backend-io.h and block-backend-global-state.h In addition, remove "block/block.h" include as it seems it is not necessary anymore, together with "qemu/iov.h" block-backend-common.h contains the structures shared between the two headers, and the functions that can't be categorized as I/O or global state. Assertions are added in the next patch. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220303151616.325444-8-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito	3b71719462	block: rename bdrv_invalidate_cache_all, blk_invalidate_cache and test_sync_op_invalidate_cache Following the bdrv_activate renaming, change also the name of the respective callers. bdrv_invalidate_cache_all -> bdrv_activate_all blk_invalidate_cache -> blk_activate test_sync_op_invalidate_cache -> test_sync_op_activate No functional change intended. Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220209105452.1694545-5-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:14:40 +01:00
Emanuele Giuseppe Esposito	a94750d956	block: introduce bdrv_activate This function is currently just a wrapper for bdrv_invalidate_cache(), but in future will contain the code of bdrv_co_invalidate_cache() that has to always be protected by BQL, and leave the rest in the I/O coroutine. Replace all bdrv_invalidate_cache() invokations with bdrv_activate(). Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Reviewed-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20220209105452.1694545-4-eesposit@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-03-04 18:14:40 +01:00
Hanna Reitz	492a119610	block-backend: Retain permissions after migration After migration, the permissions the guest device wants to impose on its BlockBackend are stored in blk->perm and blk->shared_perm. In blk_root_activate(), we take our permissions, but keep all shared permissions open by calling `blk_set_perm(blk->perm, BLK_PERM_ALL)`. Only afterwards (immediately or later, depending on the runstate) do we restrict the shared permissions by calling `blk_set_perm(blk->perm, blk->shared_perm)`. Unfortunately, our first call with shared_perm=BLK_PERM_ALL has overwritten blk->shared_perm to be BLK_PERM_ALL, so this is a no-op and the set of shared permissions is not restricted. Fix this bug by saving the set of shared permissions before invoking blk_set_perm() with BLK_PERM_ALL and restoring it afterwards. Fixes: `5f7772c4d0` ("block-backend: Defer shared_perm tightening migration completion") Reported-by: Peng Liang <liangpeng10@huawei.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20211125135317.186576-2-hreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Peng Liang <liangpeng10@huawei.com>	2022-02-01 10:51:39 +01:00
Stefan Hajnoczi	1e3552dbd2	block-backend: prevent dangling BDS pointers across aio_poll() The BlockBackend root child can change when aio_poll() is invoked. This happens when a temporary filter node is removed upon blockjob completion, for example. Functions in block/block-backend.c must be aware of this when using a blk_bs() pointer across aio_poll() because the BlockDriverState refcnt may reach 0, resulting in a stale pointer. One example is scsi_device_purge_requests(), which calls blk_drain() to wait for in-flight requests to cancel. If the backup blockjob is active, then the BlockBackend root child is a temporary filter BDS owned by the blockjob. The blockjob can complete during bdrv_drained_begin() and the last reference to the BDS is released when the temporary filter node is removed. This results in a use-after-free when blk_drain() calls bdrv_drained_end(bs) on the dangling pointer. Explicitly hold a reference to bs across block APIs that invoke aio_poll(). Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2021778 Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2036178 Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20220111153613.25453-2-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2022-01-14 12:03:16 +01:00
Hanna Reitz	73d4a11300	block-backend: Silence clang -m32 compiler warning Similarly to `e7e588d432`, there is a warning in block/block-backend.c that qiov->size <= INT64_MAX is always true on machines where size_t is narrower than a uint64_t. In said commit, we silenced this warning by casting to uint64_t. The commit introducing this warning here (`a93d81c84a`) anticipated it and so tried to address it the same way. However, it only did so in one of two places where this comparison occurs, and so we still need to fix up the other one. Fixes: `a93d81c84a` ("block-backend: convert blk_aio_ functions to int64_t bytes paramter") Signed-off-by: Hanna Reitz <hreitz@redhat.com> Message-Id: <20211026090745.30800-1-hreitz@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-11-02 13:22:09 +01:00
Vladimir Sementsov-Ogievskiy	aa78b82516	block-backend: drop INT_MAX restriction from blk_check_byte_request() blk_check_bytes_request is called from blk_co_do_preadv, blk_co_do_pwritev_part, blk_co_do_pdiscard and blk_co_copy_range before (maybe) calling throttle_group_co_io_limits_intercept() (which has int64_t argument) and then calling corresponding bdrv_co_ function. bdrv_co_ functions are OK with int64_t bytes as well. So dropping the check for INT_MAX we just get same restrictions as in bdrv_ layer: discard and write-zeroes goes through bdrv_check_qiov_request() and are allowed to be 64bit. Other requests go through bdrv_check_request32() and still restricted by INT_MAX boundary. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-13-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 16:00:07 -05:00
Vladimir Sementsov-Ogievskiy	14149710f9	block-backend: blk_pread, blk_pwrite: rename count parameter to bytes To be consistent with declarations in include/sysemu/block-backend.h. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-12-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:59:26 -05:00
Vladimir Sementsov-Ogievskiy	a93d81c84a	block-backend: convert blk_aio_ functions to int64_t bytes paramter 1. Convert bytes in BlkAioEmAIOCB: aio->bytes is only passed to already int64_t interfaces, and set in blk_aio_prwv, which is updated here. 2. For all updated functions the parameter type becomes wider so callers are safe. 3. In blk_aio_prwv we only store bytes to BlkAioEmAIOCB, which is updated here. 4. Other updated functions are wrappers on blk_aio_prwv. Note that blk_aio_preadv and blk_aio_pwritev become safer: before this commit, it's theoretically possible to pass qiov with size exceeding INT_MAX, which than converted to int argument of blk_aio_prwv. Now it's converted to int64_t which is a lot better. Still add assertions. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-11-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: tweak assertion and grammar] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:57:29 -05:00
Vladimir Sementsov-Ogievskiy	e192179bb2	block-backend: convert blk_co_copy_range to int64_t bytes Function is updated so that parameter type becomes wider, so all callers should be OK with it. Look at blk_co_copy_range() itself: bytes is passed only to blk_check_byte_request() and bdrv_co_copy_range(), which already have int64_t bytes parameter, so we are OK. Note that requests exceeding INT_MAX are still restricted by blk_check_byte_request(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-10-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:55:28 -05:00
Vladimir Sementsov-Ogievskiy	06f0325c5b	block-backend: convert blk_foo wrappers to use int64_t bytes parameter Convert blk_pdiscard, blk_pwrite_compressed, blk_pwrite_zeroes. These are just wrappers for functions with int64_t argument, so allow passing int64_t as well. Parameter type becomes wider so all callers should be OK with it. Note that requests exceeding INT_MAX are still restricted by blk_check_byte_request(). Note also that we don't (and are not going to) convert blk_pwrite and blk_pread: these functions return number of bytes on success, so to update them, we should change return type to int64_t as well, which will lead to investigating and updating all callers which is too much. So, blk_pread and blk_pwrite remain unchanged. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-9-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:53:48 -05:00
Vladimir Sementsov-Ogievskiy	16d36e2996	block-backend: drop blk_prw, use block-coroutine-wrapper Let's drop hand-made coroutine wrappers and use coroutine wrapper generation like in block/io.c. Now, blk_foo() functions are written in same way as blk_co_foo() ones, but wrap blk_do_foo() instead of blk_co_do_foo(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-8-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: spelling fix] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:53:24 -05:00
Vladimir Sementsov-Ogievskiy	70e8775ed9	block-backend: rename _do_ helper functions to _co_do_ This is a preparation to the following commit, to use automatic coroutine wrapper generation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-6-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:50:40 -05:00
Vladimir Sementsov-Ogievskiy	2800637a33	block-backend: convert blk_co_pdiscard to int64_t bytes We updated blk_do_pdiscard() and its wrapper blk_co_pdiscard(). Both functions are updated so that the parameter type becomes wider, so all callers should be OK with it. Look at blk_do_pdiscard(): bytes is passed only to blk_check_byte_request() and bdrv_co_pdiscard(), which already have int64_t bytes parameter, so we are OK. Note that requests exceeding INT_MAX are still restricted by blk_check_byte_request(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-5-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:48:56 -05:00
Vladimir Sementsov-Ogievskiy	34460feb63	block-backend: convert blk_co_pwritev_part to int64_t bytes We convert blk_do_pwritev_part() and some wrappers: blk_co_pwritev_part(), blk_co_pwritev(), blk_co_pwrite_zeroes(). All functions are converted so that the parameter type becomes wider, so all callers should be OK with it. Look at blk_do_pwritev_part() body: bytes is passed to: - trace_blk_co_pwritev (we update it here) - blk_check_byte_request, throttle_group_co_io_limits_intercept, bdrv_co_pwritev_part - all already have int64_t argument. Note that requests exceeding INT_MAX are still restricted by blk_check_byte_request(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-4-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:47:18 -05:00
Vladimir Sementsov-Ogievskiy	9547907705	block-backend: make blk_co_preadv() 64bit For both updated functions, the type of bytes becomes wider, so all callers should be OK with it. blk_co_preadv() only passes its arguments to blk_do_preadv(). blk_do_preadv() passes bytes to: - trace_blk_co_preadv, which is updated too - blk_check_byte_request, throttle_group_co_io_limits_intercept, bdrv_co_preadv, which are already int64_t. Note that requests exceeding INT_MAX are still restricted by blk_check_byte_request(). Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-3-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> [eblake: grammar tweaks] Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:46:44 -05:00
Vladimir Sementsov-Ogievskiy	7242db6389	block-backend: blk_check_byte_request(): int64_t bytes Rename size and make it int64_t to correspond to modern block layer, which always uses int64_t for offset and bytes (not in blk layer yet, which is a task for following commits). All callers pass int or unsigned int. So, for bytes in [0, INT_MAX] nothing is changed, for negative bytes we now fail on "bytes < 0" check instead of "bytes > INT_MAX" check. Note, that blk_check_byte_request() still doesn't allow requests exceeding INT_MAX. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20211006131718.214235-2-vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-10-15 15:39:40 -05:00
Paolo Bonzini	cc07162953	block: introduce max_hw_iov for use in scsi-generic Linux limits the size of iovecs to 1024 (UIO_MAXIOV in the kernel sources, IOV_MAX in POSIX). Because of this, on some host adapters requests with many iovecs are rejected with -EINVAL by the io_submit() or readv()/writev() system calls. In fact, the same limit applies to SG_IO as well. To fix both the EINVAL and the possible performance issues from using fewer iovecs than allowed by Linux (some HBAs have max_segments as low as 128), introduce a separate entry in BlockLimits to hold the max_segments value from sysfs. This new limit is used only for SG_IO and clamped to bs->bl.max_iov anyway, just like max_hw_transfer is clamped to bs->bl.max_transfer. Reported-by: Halil Pasic <pasic@linux.ibm.com> Cc: Hanna Reitz <hreitz@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-block@nongnu.org Cc: qemu-stable@nongnu.org Fixes: `18473467d5` ("file-posix: try BLKSECTGET on block devices too, do not round to power of 2", 2021-06-25) Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20210923130436.1187591-1-pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-10-06 10:25:55 +02:00
Vladimir Sementsov-Ogievskiy	ed089506ee	block: introduce blk_replace_bs Add function to change bs inside blk. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20210824083856.17408-3-vsementsov@virtuozzo.com> Signed-off-by: Hanna Reitz <hreitz@redhat.com>	2021-09-01 12:57:31 +02:00
Paolo Bonzini	24b36e9813	block: add max_hw_transfer to BlockLimits For block host devices, I/O can happen through either the kernel file descriptor I/O system calls (preadv/pwritev, io_submit, io_uring) or the SCSI passthrough ioctl SG_IO. In the latter case, the size of each transfer can be limited by the HBA, while for file descriptor I/O the kernel is able to split and merge I/O in smaller pieces as needed. Applying the HBA limits to file descriptor I/O results in more system calls and suboptimal performance, so this patch splits the max_transfer limit in two: max_transfer remains valid and is used in general, while max_hw_transfer is limited to the maximum hardware size. max_hw_transfer can then be included by the scsi-generic driver in the block limits page, to ensure that the stricter hardware limit is used. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-25 10:54:13 +02:00
Paolo Bonzini	b99f7fa08a	block-backend: align max_transfer to request alignment Block device requests must be aligned to bs->bl.request_alignment. It makes sense for drivers to align bs->bl.max_transfer the same way; however when there is no specified limit, blk_get_max_transfer just returns INT_MAX. Since the contract of the function does not specify that INT_MAX means "no maximum", just align the outcome of the function (whether INT_MAX or bs->bl.max_transfer) before returning it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-25 10:54:13 +02:00
Sergio Lopez	095cc4d0f6	block-backend: add drained_poll Allow block backends to poll their devices/users to check if they have been quiesced when entering a drained section. This will be used in the next patch to wait for the NBD server to be completely quiesced. Suggested-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20210602060552.17433-2-slp@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy	fd240a184b	block-backend: improve blk_root_get_parent_desc() We have different types of parents: block nodes, block backends and jobs. So, it makes sense to specify type together with name. While being here also use g_autofree. iotest 307 output is updated. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20210601075218.79249-3-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy	260242a833	block: drop BlockBackendRootState::read_only Instead of keeping additional boolean field, let's store the information in BDRV_O_RDWR bit of BlockBackendRootState::open_flags. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210527154056.70294-4-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy	307261b243	block: consistently use bdrv_is_read_only() It's better to use accessor function instead of bs->read_only directly. In some places use bdrv_is_writable() instead of checking both BDRV_O_RDWR set and BDRV_O_INACTIVE not set. In bdrv_open_common() it's a bit strange to add one more variable, but we are going to drop bs->read_only in the next patch, so new ro local variable substitutes it here. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20210527154056.70294-2-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-06-02 14:23:20 +02:00
Thomas Huth	4c386f8064	Do not include sysemu/sysemu.h if it's not really necessary Stop including sysemu/sysemu.h in files that don't need it. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <20210416171314.2074665-2-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-05-02 17:24:50 +02:00
Kevin Wolf	35b7f4abd5	block: Add BDRV_O_NO_SHARE for blk_new_open() Normally, blk_new_open() just shares all permissions. This was fine originally when permissions only protected against uses in the same process because no other part of the code would actually get to access the block nodes opened with blk_new_open(). However, since we use it for file locking now, unsharing permissions becomes desirable. Add a new BDRV_O_NO_SHARE flag that is used in blk_new_open() to unshare any permissions that can be unshared. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210422164344.283389-2-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy	228ca37e12	block: drop ctx argument from bdrv_root_attach_child Passing parent aio context is redundant, as child_class and parent opaque pointer are enough to retrieve it. Drop the argument and use new bdrv_child_get_parent_aio_context() interface. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210428151804.439460-7-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-04-30 12:27:47 +02:00
Vladimir Sementsov-Ogievskiy	3ca1f32257	block: BdrvChildClass: add .get_parent_aio_context handler Add new handler to get aio context and implement it in all child classes. Add corresponding public interface to be used soon. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210428151804.439460-6-vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-04-30 12:27:47 +02:00
Philippe Mathieu-Daudé	538f049704	sysemu: Let VMChangeStateHandler take boolean 'running' argument The 'running' argument from VMChangeStateHandler does not require other value than 0 / 1. Make it a plain boolean. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <20210111152020.1422021-3-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-03-09 23:13:57 +01:00
Kevin Wolf	86b1cf3227	block: Separate blk_is_writable() and blk_supports_write_perm() Currently, blk_is_read_only() tells whether a given BlockBackend can only be used in read-only mode because its root node is read-only. Some callers actually try to answer a slightly different question: Is the BlockBackend configured to be writable, by taking write permissions on the root node? This can differ, for example, for CD-ROM devices which don't take write permissions, but may be backed by a writable image file. scsi-cd allows write requests to the drive if blk_is_read_only() returns false. However, the write request will immediately run into an assertion failure because the write permission is missing. This patch introduces separate functions for both questions. blk_supports_write_perm() answers the question whether the block node/image file can support writable devices, whereas blk_is_writable() tells whether the BlockBackend is currently configured to be writable. All calls of blk_is_read_only() are converted to one of the two new functions. Fixes: https://bugs.launchpad.net/bugs/1906693 Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20210118123448.307825-2-kwolf@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2021-01-27 20:45:20 +01:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Max Reitz	9a71b9de3f	commit: Deal with filters This includes some permission limiting (for example, we only need to take the RESIZE permission if the base is smaller than the top). Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-07 12:31:31 +02:00
Stefan Hajnoczi	1d719ddc35	block: fix bdrv_aio_cancel() for ENOMEDIUM requests bdrv_aio_cancel() calls aio_poll() on the AioContext for the given I/O request until it has completed. ENOMEDIUM requests are special because there is no BlockDriverState when the drive has no medium! Define a .get_aio_context() function for BlkAioEmAIOCB requests so that bdrv_aio_cancel() can find the AioContext where the completion BH is pending. Without this function bdrv_aio_cancel() aborts on ENOMEDIUM requests! libFuzzer triggered the following assertion: cat << EOF \| qemu-system-i386 -M pc-q35-5.0 \ -nographic -monitor none -serial none \ -qtest stdio -trace ide\* outl 0xcf8 0x8000fa24 outl 0xcfc 0xe106c000 outl 0xcf8 0x8000fa04 outw 0xcfc 0x7 outl 0xcf8 0x8000fb20 write 0x0 0x3 0x2780e7 write 0xe106c22c 0xd 0x1130c218021130c218021130c2 write 0xe106c218 0x15 0x110010110010110010110010110010110010110010 EOF ide_exec_cmd IDE exec cmd: bus 0x56170a77a2b8; state 0x56170a77a340; cmd 0xe7 ide_reset IDEstate 0x56170a77a340 Aborted (core dumped) (gdb) bt #1 0x00007ffff4f93895 in abort () at /lib64/libc.so.6 #2 0x0000555555dc6c00 in bdrv_aio_cancel (acb=0x555556765550) at block/io.c:2745 #3 0x0000555555dac202 in blk_aio_cancel (acb=0x555556765550) at block/block-backend.c:1546 #4 0x0000555555b1bd74 in ide_reset (s=0x555557213340) at hw/ide/core.c:1318 #5 0x0000555555b1e3a1 in ide_bus_reset (bus=0x5555572132b8) at hw/ide/core.c:2422 #6 0x0000555555b2aa27 in ahci_reset_port (s=0x55555720eb50, port=2) at hw/ide/ahci.c:650 #7 0x0000555555b29fd7 in ahci_port_write (s=0x55555720eb50, port=2, offset=44, val=16) at hw/ide/ahci.c:360 #8 0x0000555555b2a564 in ahci_mem_write (opaque=0x55555720eb50, addr=556, val=16, size=1) at hw/ide/ahci.c:513 #9 0x000055555598415b in memory_region_write_accessor (mr=0x55555720eb80, addr=556, value=0x7fffffffb838, size=1, shift=0, mask=255, attrs=...) at softmmu/memory.c:483 Looking at bdrv_aio_cancel: 2728 /* async I/Os / 2729 2730 void bdrv_aio_cancel(BlockAIOCB acb) 2731 { 2732 qemu_aio_ref(acb); 2733 bdrv_aio_cancel_async(acb); 2734 while (acb->refcnt > 1) { 2735 if (acb->aiocb_info->get_aio_context) { 2736 aio_poll(acb->aiocb_info->get_aio_context(acb), true); 2737 } else if (acb->bs) { 2738 /* qemu_aio_ref and qemu_aio_unref are not thread-safe, so 2739 * assert that we're not using an I/O thread. Thread-safe 2740 * code should use bdrv_aio_cancel_async exclusively. 2741 */ 2742 assert(bdrv_get_aio_context(acb->bs) == qemu_get_aio_context()); 2743 aio_poll(bdrv_get_aio_context(acb->bs), true); 2744 } else { 2745 abort(); <=============== 2746 } 2747 } 2748 qemu_aio_unref(acb); 2749 } Fixes: `02c50efe08` ("block: Add bdrv_aio_cancel_async") Reported-by: Alexander Bulekov <alxndr@bu.edu> Buglink: https://bugs.launchpad.net/qemu/+bug/1878255 Originally-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20200720100141.129739-1-stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-21 12:00:38 +02:00
Greg Kurz	e6cada9231	block: Avoid stale pointer dereference in blk_get_aio_context() It is possible for blk_remove_bs() to race with blk_drain_all(), causing the latter to dereference a stale blk->root pointer: blk_remove_bs(blk) bdrv_root_unref_child(blk->root) child_bs = blk->root->bs bdrv_detach_child(blk->root) ... g_free(blk->root) <============== blk->root becomes stale bdrv_unref(child_bs) <============ yield at some point A blk_drain_all() can be triggered by some guest action in the meantime, eg. on POWER, SLOF might disable bus mastering on a virtio-scsi-pci device: virtio_write_config() virtio_pci_stop_ioeventfd() virtio_bus_stop_ioeventfd() virtio_scsi_dataplane_stop() blk_drain_all() blk_get_aio_context() bs = blk->root ? blk->root->bs : NULL ^^^^^^^^^ stale Then, depending on one's luck, QEMU either crashes with SEGV or hits the assertion in blk_get_aio_context(). blk->root is set by blk_insert_bs() which calls bdrv_root_attach_child() first. The blk_remove_bs() function should rollback the changes made by blk_insert_bs() in the opposite order (or it should be documented somewhere why this isn't the case). Clear blk->root before calling bdrv_root_unref_child() in blk_remove_bs(). Signed-off-by: Greg Kurz <groug@kaod.org> Message-Id: <159430264541.389456.11925072456012783045.stgit@bahia.lan> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:24:15 +02:00
Max Reitz	1f38f04eac	block: Pass BdrvChildRole in remaining cases These calls have no real use for the child role yet, but it will not harm to give one. Notably, the bdrv_root_attach_child() call in blockjob.c is left unmodified because there is not much the generic BlockJob object wants from its children. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200513110544.176672-34-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	3cdc69d31b	block: Pass parent_is_format to .inherit_options() We plan to unify the generic .inherit_options() functions. The resulting common function will need to decide whether to force-enable format probing, force-disable it, or leave it as-is. To make this decision, it will need to know whether the parent node is a format node or not (because we never want format probing if the parent is a format node already (except for the backing chain)). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-9-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	272c02eaef	block: Pass BdrvChildRole to .inherit_options() For now, all callers (effectively) pass 0 and no callee evaluates thie value. Later patches will change both. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-8-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	258b776515	block: Add BdrvChildRole to BdrvChild For now, it is always set to 0. Later patches in this series will ensure that all callers pass an appropriate combination of flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-6-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	bd86fb990c	block: Rename BdrvChildRole to BdrvChildClass This structure nearly only contains parent callbacks for child state changes. It cannot really reflect a child's role, because different roles may overlap (as we will see when real roles are introduced), and because parents can have custom callbacks even when the child fulfills a standard role. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200513110544.176672-4-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	2b7bbdbdef	block: Add blk_make_empty() Two callers of BlockDriver.bdrv_make_empty() remain that should not call this method directly. Both do not have access to a BdrvChild, but they can use a BlockBackend, so we add this function that lets them use it. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200429141126.85159-4-mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Eric Blake	a3aeeab557	block: Add blk_new_with_bs() helper There are several callers that need to create a new block backend from an existing BDS; make the task slightly easier with a common helper routine. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200424190903.522087-2-eblake@redhat.com> [mreitz: Set @ret only in error paths, see https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg01216.html] Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200428192648.749066-2-eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-05-05 13:17:36 +02:00
Kevin Wolf	8c6242b6f3	block-backend: Add flags to blk_truncate() Now that node level interface bdrv_truncate() supports passing request flags to the block driver, expose this on the BlockBackend level, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200424125448.63318-4-kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-04-30 17:51:07 +02:00

1 2 3 4 5 ...

278 Commits