reader_count() is a performance bottleneck because the global
aio_context_list_lock mutex causes thread contention. Put this debugging
assertion behind a new ./configure --enable-debug-graph-lock option and
disable it by default.
The --enable-debug-graph-lock option is also enabled by the more general
--enable-debug option.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230501173443.153062-1-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_refresh_limits() need to hold a reader lock for the graph because
it accesses the children list of a node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-21-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_recurse_can_replace() need to hold a reader lock for the graph
because it accesses the children list of a node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-20-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_query_bds_stats() need to hold a reader lock for the graph because
it accesses the children list of a node.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-18-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of amend
callbacks in BlockDriver need to hold a reader lock for the graph.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-17-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_co_get_info() need to hold a reader lock for the graph.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-15-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
bdrv_co_get_allocated_file_size() need to hold a reader lock for the
graph.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230504115750.54437-14-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that functions accessing
the parent list of a node need to hold a reader lock for the graph. As
it happens, they already do.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230504115750.54437-13-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that functions accessing
the parent list of a node need to hold a reader lock for the graph. As
it happens, they already do.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-12-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds GRAPH_RDLOCK annotations to declare that callers of
nbd_co_do_establish_connection() need to hold a reader lock for the
graph.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-11-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only thing nbd_co_flush() does is call nbd_client_co_flush(). Just
use that function directly in the BlockDriver definitions and remove the
wrapper.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-10-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drivers were a bit confused about whether .bdrv_open can run in a
coroutine and whether or not it holds a graph lock.
It cannot keep a graph lock from the caller across the whole function
because it both changes the graph (requires a writer lock) and does I/O
(requires a reader lock). Therefore, it should take these locks
internally as needed.
The functions used to be called in coroutine context during image
creation. This was buggy for other reasons, and as of commit 32192301,
all block drivers go through no_co_wrappers. So it is not called in
coroutine context any more.
Fix qcow2 and qed to work with the correct assumptions: The graph lock
needs to be taken internally instead of just assuming it's already
there, and the coroutine path is dead code that can be removed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-9-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These functions must not be called in coroutine context, because they
need write access to the graph.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Migration code can call bdrv_activate() in coroutine context, whereas
other callers call it outside of coroutines. As it calls other code that
is not supposed to run in coroutines, standardise on running outside of
coroutines.
This adds a no_co_wrapper to switch to the main loop before calling
bdrv_activate().
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is a bdrv_co_getlength() now, which should be used in coroutine
context.
This requires adding GRAPH_RDLOCK to some functions so that this still
compiles with TSA because bdrv_co_getlength() is GRAPH_RDLOCK.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230504115750.54437-2-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After the recent introduction of many new coroutine callbacks,
a couple calls from non-coroutine_fn to coroutine_fn have sneaked
in; fix them.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230406101752.242125-1-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Let's add --enable / --disable configure options for these formats,
so that those who don't need them may not build them.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230421092758.814122-1-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Most export types install BlockDeviceOps pointers. It is easy to forget
to remove them because that happens automatically via the "drive" qdev
property in hw/ but not block/export/.
Put blk_set_dev_ops(blk, NULL, NULL) calls in the core export.c code so
the export types don't need to remember.
This fixes the nbd and vhost-user-blk export types.
Fixes: fd6afc501a ("nbd/server: Use drained block ops to quiesce the server")
Fixes: ca858a5fe9 ("vhost-user-blk-server: notify client about disk resize")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20230502211119.720647-1-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
- Protect BlockBackend.queued_requests with its own lock
- Switch to AIO_WAIT_WHILE_UNLOCKED() where possible
- AioContext removal: LinuxAioState/LuringState/ThreadPool
- Add more coroutine_fn annotations, use bdrv/blk_co_*
- Fix crash when execute hmp_commit
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmRH0b0RHGt3b2xmQHJl
ZGhhdC5jb20ACgkQfwmycsiPL9Y0yw/6A/vzA4TGgFUP3WIvH/sQri4/V3gyR+PT
u3hOQUCYZ99nioTpKV91TSuUPuU/Mdspy/0NKM+K92yIXqxa9172A2zLOsGOu21l
qKpse+nBf1zqEgB8YzUHyCBdetPz916C/f9RS26SNUCW85GCHYGHA3u7nKvWLMyV
oKIoTlA8QOglOuEKlRoYh7hCFm7ET51NOSEftm8GsYbsW/I2Vzl8a1SHN1lHufjd
We3+898zUrmFqNMp6Rjdhn+yZmmoGzoZqV4YQi83z7xjiv+Ms4VHVVW7X8d20xRX
5BLFiLHAuZ/1d26HyVhgBUr7KHyf94odocz8BylWKXGl5SXMCZun1Td1vgVKlGK+
GRxzB2cWGWqzC2UmqSTc0Z0aIWbXukKwvcX76uBKsQZ+kB2A7jFobxHiaoQEDJ8B
WRNEMH2+CqCAu9rsrNRinnJKhT2nXcr9F9YfwRIlagdAePGWin+EUW8huf14dDBm
Z2Y34aKW4RQibF8xirMHeRBbOLmcq2VpKLKwNfBHUDgZB8iuD7bLn4n9nwWXMG1w
zgNsTybkv46vLPamTpEaUoNTHfuRDTAuE7Z7lkcc7jF41Z0V1DC/DCCWcL/0LvhP
GIxFdkYug3hetdF2U/OZhUoEfxvkqcuBnrr55LFzqheKEllQpPwPpt7UF0aH8bg3
i/YpjHsf3xU=
=mpYX
-----END PGP SIGNATURE-----
Merge tag 'for-upstream' of https://repo.or.cz/qemu/kevin into staging
Block layer patches
- Protect BlockBackend.queued_requests with its own lock
- Switch to AIO_WAIT_WHILE_UNLOCKED() where possible
- AioContext removal: LinuxAioState/LuringState/ThreadPool
- Add more coroutine_fn annotations, use bdrv/blk_co_*
- Fix crash when execute hmp_commit
# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmRH0b0RHGt3b2xmQHJl
# ZGhhdC5jb20ACgkQfwmycsiPL9Y0yw/6A/vzA4TGgFUP3WIvH/sQri4/V3gyR+PT
# u3hOQUCYZ99nioTpKV91TSuUPuU/Mdspy/0NKM+K92yIXqxa9172A2zLOsGOu21l
# qKpse+nBf1zqEgB8YzUHyCBdetPz916C/f9RS26SNUCW85GCHYGHA3u7nKvWLMyV
# oKIoTlA8QOglOuEKlRoYh7hCFm7ET51NOSEftm8GsYbsW/I2Vzl8a1SHN1lHufjd
# We3+898zUrmFqNMp6Rjdhn+yZmmoGzoZqV4YQi83z7xjiv+Ms4VHVVW7X8d20xRX
# 5BLFiLHAuZ/1d26HyVhgBUr7KHyf94odocz8BylWKXGl5SXMCZun1Td1vgVKlGK+
# GRxzB2cWGWqzC2UmqSTc0Z0aIWbXukKwvcX76uBKsQZ+kB2A7jFobxHiaoQEDJ8B
# WRNEMH2+CqCAu9rsrNRinnJKhT2nXcr9F9YfwRIlagdAePGWin+EUW8huf14dDBm
# Z2Y34aKW4RQibF8xirMHeRBbOLmcq2VpKLKwNfBHUDgZB8iuD7bLn4n9nwWXMG1w
# zgNsTybkv46vLPamTpEaUoNTHfuRDTAuE7Z7lkcc7jF41Z0V1DC/DCCWcL/0LvhP
# GIxFdkYug3hetdF2U/OZhUoEfxvkqcuBnrr55LFzqheKEllQpPwPpt7UF0aH8bg3
# i/YpjHsf3xU=
# =mpYX
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Apr 2023 02:12:29 PM BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
* tag 'for-upstream' of https://repo.or.cz/qemu/kevin: (25 commits)
block/monitor: Fix crash when executing HMP commit
vmdk: make vmdk_is_cid_valid a coroutine_fn
qcow2: mark various functions as coroutine_fn and GRAPH_RDLOCK
tests: mark more coroutine_fns
qemu-pr-helper: mark more coroutine_fns
9pfs: mark more coroutine_fns
nbd: mark more coroutine_fns, do not use co_wrappers
mirror: make mirror_flush a coroutine_fn, do not use co_wrappers
blkdebug: add missing coroutine_fn annotation
vvfat: mark various functions as coroutine_fn
thread-pool: avoid passing the pool parameter every time
thread-pool: use ThreadPool from the running thread
io_uring: use LuringState from the running thread
linux-aio: use LinuxAioState from the running thread
block: add missing coroutine_fn to bdrv_sum_allocated_file_size()
include/block: fixup typos
monitor: convert monitor_cleanup() to AIO_WAIT_WHILE_UNLOCKED()
hmp: convert handle_hmp_command() to AIO_WAIT_WHILE_UNLOCKED()
block: convert bdrv_drain_all_begin() to AIO_WAIT_WHILE_UNLOCKED()
block: convert bdrv_graph_wrlock() to AIO_WAIT_WHILE_UNLOCKED()
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
hmp_commit() calls blk_is_available() from a non-coroutine context (and
in the main loop). blk_is_available() is a co_wrapper_mixed_bdrv_rdlock
function, and in the non-coroutine context it calls AIO_WAIT_WHILE(),
which crashes if the aio_context lock is not taken before.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1615
Signed-off-by: Wang Liang <wangliangzz@inspur.com>
Message-Id: <20230424103902.45265-1-wangliangzz@126.com>
Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Functions that can do I/O are prime candidates for being coroutine_fns. Make the
change for the one that is itself called only from coroutine_fns. Unfortunately
vmdk does not use a coroutine_fn for the bulk of the open (like qcow2 does) so
vmdk_read_cid cannot have the same treatment.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230309084456.304669-10-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Functions that can do I/O (including calling bdrv_is_allocated
and bdrv_block_status functions) are prime candidates for being
coroutine_fns. Make the change for those that are themselves called
only from coroutine_fns. Also annotate that they are called with the
graph rdlock taken, thus allowing them to call bdrv_co_*() functions
for I/O.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230309084456.304669-9-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
mirror_flush calls a mixed function blk_flush but it is only called
from mirror_run; so call the coroutine version and make mirror_flush
a coroutine_fn too.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230309084456.304669-4-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230309084456.304669-3-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Functions that can do I/O are prime candidates for being coroutine_fns. Make the
change for those that are themselves called only from coroutine_fns.
In addition, coroutine_fns should do I/O using bdrv_co_*() functions, for
which it is required to hold the BlockDriverState graph lock. So also nnotate
functions on the I/O path with TSA attributes, making it possible to
switch them to use bdrv_co_*() functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230309084456.304669-2-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
thread_pool_submit_aio() is always called on a pool taken from
qemu_get_current_aio_context(), and that is the only intended
use: each pool runs only in the same thread that is submitting
work to it, it can't run anywhere else.
Therefore simplify the thread_pool_submit* API and remove the
ThreadPool function parameter.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230203131731.851116-5-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use qemu_get_current_aio_context() where possible, since we always
submit work to the current thread anyways.
We want to also be sure that the thread submitting the work is
the same as the one processing the pool, to avoid adding
synchronization to the pool list.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230203131731.851116-4-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Remove usage of aio_context_acquire by always submitting asynchronous
AIO to the current thread's LuringState.
In order to prevent mistakes from the caller side, avoid passing LuringState
in luring_io_{plug/unplug} and luring_co_submit, and document the functions
to make clear that they work in the current thread's AioContext.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230203131731.851116-3-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Remove usage of aio_context_acquire by always submitting asynchronous
AIO to the current thread's LinuxAioState.
In order to prevent mistakes from the caller side, avoid passing LinuxAioState
in laio_io_{plug/unplug} and laio_co_submit, and document the functions
to make clear that they work in the current thread's AioContext.
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20230203131731.851116-2-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Since the AioContext argument was already NULL, AIO_WAIT_WHILE() was
never going to unlock the AioContext. Therefore it is possible to
replace AIO_WAIT_WHILE() with AIO_WAIT_WHILE_UNLOCKED().
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230309190855.414275-5-stefanha@redhat.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The following conversion is safe and does not change behavior:
GLOBAL_STATE_CODE();
...
- AIO_WAIT_WHILE(qemu_get_aio_context(), ...);
+ AIO_WAIT_WHILE_UNLOCKED(NULL, ...);
Since we're in GLOBAL_STATE_CODE(), qemu_get_aio_context() is our home
thread's AioContext. Thus AIO_WAIT_WHILE() does not unlock the
AioContext:
if (ctx_ && in_aio_context_home_thread(ctx_)) { \
while ((cond)) { \
aio_poll(ctx_, true); \
waited_ = true; \
} \
And that means AIO_WAIT_WHILE_UNLOCKED(NULL, ...) can be substituted.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230309190855.414275-4-stefanha@redhat.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no change in behavior. Switch to AIO_WAIT_WHILE_UNLOCKED()
instead of AIO_WAIT_WHILE() to document that this code has already been
audited and converted. The AioContext argument is already NULL so
aio_context_release() is never called anyway.
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230309190855.414275-3-stefanha@redhat.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no need for the AioContext lock in bdrv_drain_all() because
nothing in AIO_WAIT_WHILE() needs the lock and the condition is atomic.
AIO_WAIT_WHILE_UNLOCKED() has no use for the AioContext parameter other
than performing a check that is nowadays already done by the
GLOBAL_STATE_CODE()/IO_CODE() macros. Set the ctx argument to NULL here
to help us keep track of all converted callers. Eventually all callers
will have been converted and then the argument can be dropped entirely.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230309190855.414275-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wilfred Mallawa <wilfred.mallawa@wdc.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The CoQueue API offers thread-safety via the lock argument that
qemu_co_queue_wait() and qemu_co_enter_next() take. BlockBackend
currently does not make use of the lock argument. This means that
multiple threads submitting I/O requests can corrupt the CoQueue's
QSIMPLEQ.
Add a QemuMutex and pass it to CoQueue APIs so that the queue is
protected. While we're at it, also assert that the queue is empty when
the BlockBackend is deleted.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230307210427.269214-4-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This field is accessed by multiple threads without a lock. Use explicit
qatomic_read()/qatomic_set() calls. There is no need for acquire/release
because blk_set_disable_request_queuing() doesn't provide any
guarantees (it helps that it's used at BlockBackend creation time and
not when there is I/O in flight).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Hanna Czenczek <hreitz@redhat.com>
Message-Id: <20230307210427.269214-3-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The main loop thread increments/decrements BlockBackend->quiesce_counter
when drained sections begin/end. The counter is read in the I/O code
path. Therefore this field is used to communicate between threads
without a lock.
Acquire/release are not necessary because the BlockBackend->in_flight
counter already uses sequentially consistent accesses and running I/O
requests hold that counter when blk_wait_while_drained() is called.
qatomic_read() can be used.
Use qatomic_fetch_inc()/qatomic_fetch_dec() for modifications even
though sequentially consistent atomic accesses are not strictly required
here. They are, however, nicer to read than multiple calls to
qatomic_read() and qatomic_set(). Since beginning and ending drain is
not a hot path the extra cost doesn't matter.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230307210427.269214-2-stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Mostly just fixes, cleanups all over the place.
Some optimizations.
More control over slot_reserved_mask.
More feature bits supported for SVQ.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRHQvAPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpQc0H/RD+RXy7IAnmhkdCyjj0hM8pftPTwCJfrSCW
DLHP4c5jiKO5ngUoAv3YJdM77TBCXlJn6gceeKBrzhGUTtJ7dTLC+Udeq/jW43EF
/E2ldLLbTNFyUqW8yX7D+EVio7Jy4zXTHpczKCF5vO7MaVWS/b3QdCpmjXpEHLNb
janv24vQHHgmRwK96uIdIauJJT8aqYW0arn1po8anxuFS8ok9Tf8LTEF5uBHokJP
MriTwMaqMgRK+4rzh+b6wc7QC5GqIr44gFrsfFYuNOUY0+BizvGvUAtMt+B/XZwt
OF4RSShUh2bhsQoYwgvShfEsR/vWwOl3yMAhcsB+wMgMzMG8MUQ=
=e8DF
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: fixes, features, cleanups
Mostly just fixes, cleanups all over the place.
Some optimizations.
More control over slot_reserved_mask.
More feature bits supported for SVQ.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRHQvAPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpQc0H/RD+RXy7IAnmhkdCyjj0hM8pftPTwCJfrSCW
# DLHP4c5jiKO5ngUoAv3YJdM77TBCXlJn6gceeKBrzhGUTtJ7dTLC+Udeq/jW43EF
# /E2ldLLbTNFyUqW8yX7D+EVio7Jy4zXTHpczKCF5vO7MaVWS/b3QdCpmjXpEHLNb
# janv24vQHHgmRwK96uIdIauJJT8aqYW0arn1po8anxuFS8ok9Tf8LTEF5uBHokJP
# MriTwMaqMgRK+4rzh+b6wc7QC5GqIr44gFrsfFYuNOUY0+BizvGvUAtMt+B/XZwt
# OF4RSShUh2bhsQoYwgvShfEsR/vWwOl3yMAhcsB+wMgMzMG8MUQ=
# =e8DF
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Apr 2023 04:03:12 AM BST
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (31 commits)
hw/pci-bridge: Make PCIe and CXL PXB Devices inherit from TYPE_PXB_DEV
hw/pci-bridge: pci_expander_bridge fix type in pxb_cxl_dev_reset()
docs/specs: Convert pci-testdev.txt to rst
docs/specs: Convert pci-serial.txt to rst
docs/specs/pci-ids: Convert from txt to rST
acpi: pcihp: allow repeating hot-unplug requests
virtio: i2c: Check notifier helpers for VIRTIO_CONFIG_IRQ_IDX
docs: Remove obsolete descriptions of SR-IOV support
intel_iommu: refine iotlb hash calculation
docs/cxl: Fix sentence
MAINTAINERS: Add Eugenio Pérez as vhost-shadow-virtqueue reviewer
tests: bios-tables-test: replace memset with initializer
hw/acpi: limit warning on acpi table size to pc machines older than version 2.3
Add my old and new work email mapping and use work email to support acpi
vhost-user-blk-server: notify client about disk resize
pci: avoid accessing slot_reserved_mask directly outside of pci.c
hw: Add compat machines for 8.1
hw/i386/amd_iommu: Factor amdvi_pci_realize out of amdvi_sysbus_realize
hw/i386/amd_iommu: Set PCI static/const fields via PCIDeviceClass
hw/i386/amd_iommu: Move capab_offset from AMDVIState to AMDVIPCIState
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Introduce the BdrvDmgUncompressFunc type defintion. To emphasis
dmg_uncompress_bz2 and dmg_uncompress_lzfse are pointer to functions,
declare them using this new typedef.
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-id: 20230320152610.32052-1-philmd@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently block_resize qmp command is simply ignored by vhost-user-blk
export. So, the block-node is successfully resized, but virtio config
is unchanged and guest doesn't see that disk is resized.
Let's handle the resize by modifying the config and notifying the guest
appropriately.
After this comment, lsblk in linux guest with attached
vhost-user-blk-pci device shows new size immediately after block_resize
QMP command on vhost-user exported block node.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20230321201323.3695923-1-vsementsov@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is already a barrier in AIO_WAIT_WHILE_INTERNAL(), thus the
qatomic_mb_read() is not adding anything.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the former nfs_get_allocated_file_size is now a coroutine
function, it must suspend rather than poll. Switch BDRV_POLL_WHILE()
to a qemu_coroutine_yield() loop and schedule nfs_co_generic_bh_cb()
in place of the call to bdrv_wakeup().
Fixes: 82618d7bc3 ("block: Convert bdrv_get_allocated_file_size() to co_wrapper", 2023-02-01)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230412112606.80983-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The introduction of the graph lock is causing blk_get_geometry, a hot function
used in the I/O path, to create a coroutine. However, the only part that really
needs to run in coroutine context is the call to bdrv_co_refresh_total_sectors,
which in turn only happens in the rare case of host CD-ROM devices.
So, write by hand the three wrappers on the path from blk_co_get_geometry to
bdrv_co_refresh_total_sectors, so that the coroutine wrapper is only created
if bdrv_nb_sectors actually calls bdrv_refresh_total_sectors.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-9-pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers of blk_co_nb_sectors (and blk_nb_sectors) are able to
handle a non-inserted CD-ROM as a zero-length file, they do not need
to raise an error.
Not using blk_co_is_available() aligns the function with
blk_co_get_geometry(), which becomes a simple wrapper for
blk_co_nb_sectors(). It will also make it possible to skip the creation
of a coroutine in the (common) case where bs->bl.has_variable_length
is false.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-8-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_co_get_geometry is only used in blk_co_get_geometry. Inline it in
there, to reduce the number of wrappers for bs->total_sectors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-7-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fill in the field in BlockLimits directly for host devices, and
copy it from there for the raw format.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-5-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Filters automatically get has_variable_length from their underlying
BlockDriverState. There is no need to mark them as variable-length
in the BlockDriver.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-3-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
At the protocol level, has_variable_length only needs to be true in the
very special case of host CD-ROM drives, so that they do not need an
explicit monitor command to read the new size when a disc is loaded
in the tray.
However, at the format level has_variable_length has to be true for all
raw blockdevs and for all filters, even though in practice the length
depends on the underlying file and thus will not change except in the
case of host CD-ROM drives.
As a first step towards computing an accurate value of has_variable_length,
add the value into the BlockLimits structure and initialize the field
from the BlockDriver.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20230407153303.391121-2-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The corruption occurs when a BAT entry aligned to 4096 bytes is changed.
Specifically, the corruption occurs during the creation of the LOG Data
Descriptor. The incorrect behavior involves copying 4088 bytes from the
original 4096 bytes aligned offset to `tmp[8..4096]` and then copying
the new value for the first BAT entry to the beginning `tmp[0..8]`.
This results in all existing BAT entries inside the 4K region being
incorrectly moved by 8 bytes and the last entry being lost.
This bug did not cause noticeable corruption when only sequentially
writing once to an empty dynamic VHDX (e.g.
using `qemu-img convert -O vhdx -o subformat=dynamic ...`), but it
still resulted in invalid values for the (unused) Sector Bitmap BAT
entries.
Importantly, this corruption would only become noticeable after the
corrupted BAT is re-read from the file.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/727
Cc: qemu-stable@nongnu.org
Signed-off-by: Lukas Tschoke <lukts330@gmail.com>
Message-Id: <6cfb6d6b-adc5-7772-c8a5-6bae9a0ad668@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When liblzfe (Apple LZFSE compression library) is present
(for example installed via 'brew') on Darwin, QEMU build
fails as:
Has header "lzfse.h" : YES
Library lzfse found: YES
Dependencies
lzo support : NO
snappy support : NO
bzip2 support : YES
lzfse support : YES
zstd support : YES 1.5.2
User defined options
dmg : enabled
lzfse : enabled
[221/903] Compiling C object libblock.fa.p/block_dmg-lzfse.c.o
FAILED: libblock.fa.p/block_dmg-lzfse.c.o
/opt/homebrew/Cellar/lzfse/1.0/include/lzfse.h:56:43: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
LZFSE_API size_t lzfse_encode_scratch_size();
^
void
/opt/homebrew/Cellar/lzfse/1.0/include/lzfse.h:94:43: error: this function declaration is not a prototype [-Werror,-Wstrict-prototypes]
LZFSE_API size_t lzfse_decode_scratch_size();
^
void
2 errors generated.
ninja: build stopped: subcommand failed.
This issue has been reported in the lzfse project in 2016:
https://github.com/lzfse/lzfse/issues/3#issuecomment-226574719
Since the project seems unmaintained, simply ignore the
strict-prototypes warning check for the <lzfse.h> header,
similarly to how we deal with the GtkItemFactoryCallback
prototype from <gtk/gtkitemfactory.h>, indirectly included
by <gtk/gtk.h>.
Cc: Julio Faracco <jcfaracco@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20230327151349.97572-1-philmd@linaro.org>