block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
/*
|
|
|
|
* Block tests for iothreads
|
|
|
|
*
|
|
|
|
* Copyright (c) 2018 Kevin Wolf <kwolf@redhat.com>
|
|
|
|
*
|
|
|
|
* Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
|
|
* of this software and associated documentation files (the "Software"), to deal
|
|
|
|
* in the Software without restriction, including without limitation the rights
|
|
|
|
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
|
|
* copies of the Software, and to permit persons to whom the Software is
|
|
|
|
* furnished to do so, subject to the following conditions:
|
|
|
|
*
|
|
|
|
* The above copyright notice and this permission notice shall be included in
|
|
|
|
* all copies or substantial portions of the Software.
|
|
|
|
*
|
|
|
|
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
|
|
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
|
|
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
|
|
|
|
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
|
|
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
|
|
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
|
|
|
|
* THE SOFTWARE.
|
|
|
|
*/
|
|
|
|
|
|
|
|
#include "qemu/osdep.h"
|
|
|
|
#include "block/block.h"
|
|
|
|
#include "block/blockjob_int.h"
|
|
|
|
#include "sysemu/block-backend.h"
|
|
|
|
#include "qapi/error.h"
|
2019-05-06 19:18:00 +02:00
|
|
|
#include "qapi/qmp/qdict.h"
|
Include qemu/main-loop.h less
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h). It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.
Include qemu/main-loop.h only where it's needed. Touching it now
recompiles only some 1700 objects. For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the
others, they shrink only slightly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-12 07:23:50 +02:00
|
|
|
#include "qemu/main-loop.h"
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
#include "iothread.h"
|
|
|
|
|
block: use int64_t instead of uint64_t in driver read handlers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver read handlers parameters which are already 64bit to
signed type.
While being here, convert also flags parameter to be BdrvRequestFlags.
Now let's consider all callers. Simple
git grep '\->bdrv_\(aio\|co\)_preadv\(_part\)\?'
shows that's there three callers of driver function:
bdrv_driver_preadv() in block/io.c, passes int64_t, checked by
bdrv_check_qiov_request() to be non-negative.
qcow2_load_vmstate() does bdrv_check_qiov_request().
do_perform_cow_read() has uint64_t argument. And a lot of things in
qcow2 driver are uint64_t, so converting it is big job. But we must
not work with requests that don't satisfy bdrv_check_qiov_request(),
so let's just assert it here.
Still, the functions may be called directly, not only by drv->...
Let's check:
git grep '\.bdrv_\(aio\|co\)_preadv\(_part\)\?\s*=' | \
awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
while read func; do git grep "$func(" | \
grep -v "$func(BlockDriverState"; done
The only one such caller:
QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1);
...
ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0);
in tests/unit/test-bdrv-drain.c, and it's OK obviously.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix typos]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-03 12:27:59 +02:00
|
|
|
static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs,
|
|
|
|
int64_t offset, int64_t bytes,
|
|
|
|
QEMUIOVector *qiov,
|
|
|
|
BdrvRequestFlags flags)
|
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int coroutine_fn bdrv_test_co_pwritev(BlockDriverState *bs,
|
block: use int64_t instead of uint64_t in driver write handlers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver write handlers parameters which are already 64bit to
signed type.
While being here, convert also flags parameter to be BdrvRequestFlags.
Now let's consider all callers. Simple
git grep '\->bdrv_\(aio\|co\)_pwritev\(_part\)\?'
shows that's there three callers of driver function:
bdrv_driver_pwritev() and bdrv_driver_pwritev_compressed() in
block/io.c, both pass int64_t, checked by bdrv_check_qiov_request() to
be non-negative.
qcow2_save_vmstate() does bdrv_check_qiov_request().
Still, the functions may be called directly, not only by drv->...
Let's check:
git grep '\.bdrv_\(aio\|co\)_pwritev\(_part\)\?\s*=' | \
awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
while read func; do git grep "$func(" | \
grep -v "$func(BlockDriverState"; done
shows several callers:
qcow2:
qcow2_co_truncate() write at most up to @offset, which is checked in
generic qcow2_co_truncate() by bdrv_check_request().
qcow2_co_pwritev_compressed_task() pass the request (or part of the
request) that already went through normal write path, so it should
be OK
qcow:
qcow_co_pwritev_compressed() pass int64_t, it's updated by this patch
quorum:
quorum_co_pwrite_zeroes() pass int64_t and int - OK
throttle:
throttle_co_pwritev_compressed() pass int64_t, it's updated by this
patch
vmdk:
vmdk_co_pwritev_compressed() pass int64_t, it's updated by this
patch
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-03 12:28:00 +02:00
|
|
|
int64_t offset, int64_t bytes,
|
block: use int64_t instead of uint64_t in driver read handlers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver read handlers parameters which are already 64bit to
signed type.
While being here, convert also flags parameter to be BdrvRequestFlags.
Now let's consider all callers. Simple
git grep '\->bdrv_\(aio\|co\)_preadv\(_part\)\?'
shows that's there three callers of driver function:
bdrv_driver_preadv() in block/io.c, passes int64_t, checked by
bdrv_check_qiov_request() to be non-negative.
qcow2_load_vmstate() does bdrv_check_qiov_request().
do_perform_cow_read() has uint64_t argument. And a lot of things in
qcow2 driver are uint64_t, so converting it is big job. But we must
not work with requests that don't satisfy bdrv_check_qiov_request(),
so let's just assert it here.
Still, the functions may be called directly, not only by drv->...
Let's check:
git grep '\.bdrv_\(aio\|co\)_preadv\(_part\)\?\s*=' | \
awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
while read func; do git grep "$func(" | \
grep -v "$func(BlockDriverState"; done
The only one such caller:
QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1);
...
ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0);
in tests/unit/test-bdrv-drain.c, and it's OK obviously.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix typos]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-03 12:27:59 +02:00
|
|
|
QEMUIOVector *qiov,
|
block: use int64_t instead of uint64_t in driver write handlers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver write handlers parameters which are already 64bit to
signed type.
While being here, convert also flags parameter to be BdrvRequestFlags.
Now let's consider all callers. Simple
git grep '\->bdrv_\(aio\|co\)_pwritev\(_part\)\?'
shows that's there three callers of driver function:
bdrv_driver_pwritev() and bdrv_driver_pwritev_compressed() in
block/io.c, both pass int64_t, checked by bdrv_check_qiov_request() to
be non-negative.
qcow2_save_vmstate() does bdrv_check_qiov_request().
Still, the functions may be called directly, not only by drv->...
Let's check:
git grep '\.bdrv_\(aio\|co\)_pwritev\(_part\)\?\s*=' | \
awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
while read func; do git grep "$func(" | \
grep -v "$func(BlockDriverState"; done
shows several callers:
qcow2:
qcow2_co_truncate() write at most up to @offset, which is checked in
generic qcow2_co_truncate() by bdrv_check_request().
qcow2_co_pwritev_compressed_task() pass the request (or part of the
request) that already went through normal write path, so it should
be OK
qcow:
qcow_co_pwritev_compressed() pass int64_t, it's updated by this patch
quorum:
quorum_co_pwrite_zeroes() pass int64_t and int - OK
throttle:
throttle_co_pwritev_compressed() pass int64_t, it's updated by this
patch
vmdk:
vmdk_co_pwritev_compressed() pass int64_t, it's updated by this
patch
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-03 12:28:00 +02:00
|
|
|
BdrvRequestFlags flags)
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int coroutine_fn bdrv_test_co_pdiscard(BlockDriverState *bs,
|
block: use int64_t instead of int in driver discard handlers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver discard handlers bytes parameter to int64_t.
The only caller of all updated function is bdrv_co_pdiscard in
block/io.c. It is already prepared to work with 64bit requests, but
pass at most max(bs->bl.max_pdiscard, INT_MAX) to the driver.
Let's look at all updated functions:
blkdebug: all calculations are still OK, thanks to
bdrv_check_qiov_request().
both rule_check and bdrv_co_pdiscard are 64bit
blklogwrites: pass to blk_loc_writes_co_log which is 64bit
blkreplay, copy-on-read, filter-compress: pass to bdrv_co_pdiscard, OK
copy-before-write: pass to bdrv_co_pdiscard which is 64bit and to
cbw_do_copy_before_write which is 64bit
file-posix: one handler calls raw_account_discard() is 64bit and both
handlers calls raw_do_pdiscard(). Update raw_do_pdiscard, which pass
to RawPosixAIOData::aio_nbytes, which is 64bit (and calls
raw_account_discard())
gluster: somehow, third argument of glfs_discard_async is size_t.
Let's set max_pdiscard accordingly.
iscsi: iscsi_allocmap_set_invalid is 64bit,
!is_byte_request_lun_aligned is 64bit.
list.num is uint32_t. Let's clarify max_pdiscard and
pdiscard_alignment.
mirror_top: pass to bdrv_mirror_top_do_write() which is
64bit
nbd: protocol limitation. max_pdiscard is alredy set strict enough,
keep it as is for now.
nvme: buf.nlb is uint32_t and we do shift. So, add corresponding limits
to nvme_refresh_limits().
preallocate: pass to bdrv_co_pdiscard() which is 64bit.
rbd: pass to qemu_rbd_start_co() which is 64bit.
qcow2: calculations are still OK, thanks to bdrv_check_qiov_request(),
qcow2_cluster_discard() is 64bit.
raw-format: raw_adjust_offset() is 64bit, bdrv_co_pdiscard too.
throttle: pass to bdrv_co_pdiscard() which is 64bit and to
throttle_group_co_io_limits_intercept() which is 64bit as well.
test-block-iothread: bytes argument is unused
Great! Now all drivers are prepared to handle 64bit discard requests,
or else have explicit max_pdiscard limits.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-11-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-03 12:28:06 +02:00
|
|
|
int64_t offset, int64_t bytes)
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int coroutine_fn
|
2019-09-18 11:51:40 +02:00
|
|
|
bdrv_test_co_truncate(BlockDriverState *bs, int64_t offset, bool exact,
|
2020-04-24 14:54:39 +02:00
|
|
|
PreallocMode prealloc, BdrvRequestFlags flags,
|
|
|
|
Error **errp)
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
{
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int coroutine_fn bdrv_test_co_block_status(BlockDriverState *bs,
|
|
|
|
bool want_zero,
|
|
|
|
int64_t offset, int64_t count,
|
|
|
|
int64_t *pnum, int64_t *map,
|
|
|
|
BlockDriverState **file)
|
|
|
|
{
|
|
|
|
*pnum = count;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static BlockDriver bdrv_test = {
|
|
|
|
.format_name = "test",
|
|
|
|
.instance_size = 1,
|
|
|
|
|
block: use int64_t instead of uint64_t in driver read handlers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.
Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.
We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).
So, convert driver read handlers parameters which are already 64bit to
signed type.
While being here, convert also flags parameter to be BdrvRequestFlags.
Now let's consider all callers. Simple
git grep '\->bdrv_\(aio\|co\)_preadv\(_part\)\?'
shows that's there three callers of driver function:
bdrv_driver_preadv() in block/io.c, passes int64_t, checked by
bdrv_check_qiov_request() to be non-negative.
qcow2_load_vmstate() does bdrv_check_qiov_request().
do_perform_cow_read() has uint64_t argument. And a lot of things in
qcow2 driver are uint64_t, so converting it is big job. But we must
not work with requests that don't satisfy bdrv_check_qiov_request(),
so let's just assert it here.
Still, the functions may be called directly, not only by drv->...
Let's check:
git grep '\.bdrv_\(aio\|co\)_preadv\(_part\)\?\s*=' | \
awk '{print $4}' | sed 's/,//' | sed 's/&//' | sort | uniq | \
while read func; do git grep "$func(" | \
grep -v "$func(BlockDriverState"; done
The only one such caller:
QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, &data, 1);
...
ret = bdrv_replace_test_co_preadv(bs, 0, 1, &qiov, 0);
in tests/unit/test-bdrv-drain.c, and it's OK obviously.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210903102807.27127-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix typos]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-09-03 12:27:59 +02:00
|
|
|
.bdrv_co_preadv = bdrv_test_co_preadv,
|
|
|
|
.bdrv_co_pwritev = bdrv_test_co_pwritev,
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
.bdrv_co_pdiscard = bdrv_test_co_pdiscard,
|
|
|
|
.bdrv_co_truncate = bdrv_test_co_truncate,
|
|
|
|
.bdrv_co_block_status = bdrv_test_co_block_status,
|
|
|
|
};
|
|
|
|
|
|
|
|
static void test_sync_op_pread(BdrvChild *c)
|
|
|
|
{
|
|
|
|
uint8_t buf[512];
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Success */
|
|
|
|
ret = bdrv_pread(c, 0, buf, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, 512);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
|
|
|
ret = bdrv_pread(c, -2, buf, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, -EIO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_pwrite(BdrvChild *c)
|
|
|
|
{
|
2021-03-19 12:22:18 +01:00
|
|
|
uint8_t buf[512] = { 0 };
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Success */
|
|
|
|
ret = bdrv_pwrite(c, 0, buf, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, 512);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
|
|
|
ret = bdrv_pwrite(c, -2, buf, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, -EIO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_blk_pread(BlockBackend *blk)
|
|
|
|
{
|
|
|
|
uint8_t buf[512];
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Success */
|
|
|
|
ret = blk_pread(blk, 0, buf, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, 512);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
|
|
|
ret = blk_pread(blk, -2, buf, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, -EIO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_blk_pwrite(BlockBackend *blk)
|
|
|
|
{
|
2021-03-19 12:22:18 +01:00
|
|
|
uint8_t buf[512] = { 0 };
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Success */
|
|
|
|
ret = blk_pwrite(blk, 0, buf, sizeof(buf), 0);
|
|
|
|
g_assert_cmpint(ret, ==, 512);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
|
|
|
ret = blk_pwrite(blk, -2, buf, sizeof(buf), 0);
|
|
|
|
g_assert_cmpint(ret, ==, -EIO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_load_vmstate(BdrvChild *c)
|
|
|
|
{
|
|
|
|
uint8_t buf[512];
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Error: Driver does not support snapshots */
|
|
|
|
ret = bdrv_load_vmstate(c->bs, buf, 0, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, -ENOTSUP);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_save_vmstate(BdrvChild *c)
|
|
|
|
{
|
2021-03-19 12:22:18 +01:00
|
|
|
uint8_t buf[512] = { 0 };
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Error: Driver does not support snapshots */
|
|
|
|
ret = bdrv_save_vmstate(c->bs, buf, 0, sizeof(buf));
|
|
|
|
g_assert_cmpint(ret, ==, -ENOTSUP);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_pdiscard(BdrvChild *c)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Normal success path */
|
|
|
|
c->bs->open_flags |= BDRV_O_UNMAP;
|
|
|
|
ret = bdrv_pdiscard(c, 0, 512);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early success: UNMAP not supported */
|
|
|
|
c->bs->open_flags &= ~BDRV_O_UNMAP;
|
|
|
|
ret = bdrv_pdiscard(c, 0, 512);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
|
|
|
ret = bdrv_pdiscard(c, -2, 512);
|
|
|
|
g_assert_cmpint(ret, ==, -EIO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_blk_pdiscard(BlockBackend *blk)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Early success: UNMAP not supported */
|
|
|
|
ret = blk_pdiscard(blk, 0, 512);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
|
|
|
ret = blk_pdiscard(blk, -2, 512);
|
|
|
|
g_assert_cmpint(ret, ==, -EIO);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_truncate(BdrvChild *c)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Normal success path */
|
2020-04-24 14:54:40 +02:00
|
|
|
ret = bdrv_truncate(c, 65536, false, PREALLOC_MODE_OFF, 0, NULL);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early error: Negative offset */
|
2020-04-24 14:54:40 +02:00
|
|
|
ret = bdrv_truncate(c, -2, false, PREALLOC_MODE_OFF, 0, NULL);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
g_assert_cmpint(ret, ==, -EINVAL);
|
|
|
|
|
|
|
|
/* Error: Read-only image */
|
|
|
|
c->bs->open_flags &= ~BDRV_O_RDWR;
|
|
|
|
|
2020-04-24 14:54:40 +02:00
|
|
|
ret = bdrv_truncate(c, 65536, false, PREALLOC_MODE_OFF, 0, NULL);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
g_assert_cmpint(ret, ==, -EACCES);
|
|
|
|
|
|
|
|
c->bs->open_flags |= BDRV_O_RDWR;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_block_status(BdrvChild *c)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
int64_t n;
|
|
|
|
|
|
|
|
/* Normal success path */
|
|
|
|
ret = bdrv_is_allocated(c->bs, 0, 65536, &n);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early success: No driver support */
|
|
|
|
bdrv_test.bdrv_co_block_status = NULL;
|
|
|
|
ret = bdrv_is_allocated(c->bs, 0, 65536, &n);
|
|
|
|
g_assert_cmpint(ret, ==, 1);
|
|
|
|
|
|
|
|
/* Early success: bytes = 0 */
|
|
|
|
ret = bdrv_is_allocated(c->bs, 0, 0, &n);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early success: Offset > image size*/
|
|
|
|
ret = bdrv_is_allocated(c->bs, 0x1000000, 0x1000000, &n);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_flush(BdrvChild *c)
|
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Normal success path */
|
|
|
|
ret = bdrv_flush(c->bs);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early success: Read-only image */
|
|
|
|
c->bs->open_flags &= ~BDRV_O_RDWR;
|
|
|
|
|
|
|
|
ret = bdrv_flush(c->bs);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
c->bs->open_flags |= BDRV_O_RDWR;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_blk_flush(BlockBackend *blk)
|
|
|
|
{
|
|
|
|
BlockDriverState *bs = blk_bs(blk);
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Normal success path */
|
|
|
|
ret = blk_flush(blk);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
/* Early success: Read-only image */
|
|
|
|
bs->open_flags &= ~BDRV_O_RDWR;
|
|
|
|
|
|
|
|
ret = blk_flush(blk);
|
|
|
|
g_assert_cmpint(ret, ==, 0);
|
|
|
|
|
|
|
|
bs->open_flags |= BDRV_O_RDWR;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_sync_op_check(BdrvChild *c)
|
|
|
|
{
|
|
|
|
BdrvCheckResult result;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
/* Error: Driver does not implement check */
|
|
|
|
ret = bdrv_check(c->bs, &result, 0);
|
|
|
|
g_assert_cmpint(ret, ==, -ENOTSUP);
|
|
|
|
}
|
|
|
|
|
2022-02-09 11:54:51 +01:00
|
|
|
static void test_sync_op_activate(BdrvChild *c)
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
{
|
|
|
|
/* Early success: Image is not inactive */
|
2022-02-09 11:54:50 +01:00
|
|
|
bdrv_activate(c->bs, NULL);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
}
|
|
|
|
|
|
|
|
|
|
|
|
typedef struct SyncOpTest {
|
|
|
|
const char *name;
|
|
|
|
void (*fn)(BdrvChild *c);
|
|
|
|
void (*blkfn)(BlockBackend *blk);
|
|
|
|
} SyncOpTest;
|
|
|
|
|
|
|
|
const SyncOpTest sync_op_tests[] = {
|
|
|
|
{
|
|
|
|
.name = "/sync-op/pread",
|
|
|
|
.fn = test_sync_op_pread,
|
|
|
|
.blkfn = test_sync_op_blk_pread,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/pwrite",
|
|
|
|
.fn = test_sync_op_pwrite,
|
|
|
|
.blkfn = test_sync_op_blk_pwrite,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/load_vmstate",
|
|
|
|
.fn = test_sync_op_load_vmstate,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/save_vmstate",
|
|
|
|
.fn = test_sync_op_save_vmstate,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/pdiscard",
|
|
|
|
.fn = test_sync_op_pdiscard,
|
|
|
|
.blkfn = test_sync_op_blk_pdiscard,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/truncate",
|
|
|
|
.fn = test_sync_op_truncate,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/block_status",
|
|
|
|
.fn = test_sync_op_block_status,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/flush",
|
|
|
|
.fn = test_sync_op_flush,
|
|
|
|
.blkfn = test_sync_op_blk_flush,
|
|
|
|
}, {
|
|
|
|
.name = "/sync-op/check",
|
|
|
|
.fn = test_sync_op_check,
|
|
|
|
}, {
|
2022-02-09 11:54:51 +01:00
|
|
|
.name = "/sync-op/activate",
|
|
|
|
.fn = test_sync_op_activate,
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
},
|
|
|
|
};
|
|
|
|
|
|
|
|
/* Test synchronous operations that run in a different iothread, so we have to
|
|
|
|
* poll for the coroutine there to return. */
|
|
|
|
static void test_sync_op(const void *opaque)
|
|
|
|
{
|
|
|
|
const SyncOpTest *t = opaque;
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
|
|
|
BlockBackend *blk;
|
|
|
|
BlockDriverState *bs;
|
|
|
|
BdrvChild *c;
|
|
|
|
|
2019-04-25 14:25:10 +02:00
|
|
|
blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL, BLK_PERM_ALL);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
bs = bdrv_new_open_driver(&bdrv_test, "base", BDRV_O_RDWR, &error_abort);
|
|
|
|
bs->total_sectors = 65536 / BDRV_SECTOR_SIZE;
|
|
|
|
blk_insert_bs(blk, bs, &error_abort);
|
|
|
|
c = QLIST_FIRST(&bs->parents);
|
|
|
|
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, ctx, &error_abort);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
aio_context_acquire(ctx);
|
|
|
|
t->fn(c);
|
|
|
|
if (t->blkfn) {
|
|
|
|
t->blkfn(blk);
|
|
|
|
}
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, qemu_get_aio_context(), &error_abort);
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_release(ctx);
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
|
|
|
|
bdrv_unref(bs);
|
|
|
|
blk_unref(blk);
|
|
|
|
}
|
|
|
|
|
2019-05-03 19:17:44 +02:00
|
|
|
typedef struct TestBlockJob {
|
|
|
|
BlockJob common;
|
|
|
|
bool should_complete;
|
|
|
|
int n;
|
|
|
|
} TestBlockJob;
|
|
|
|
|
|
|
|
static int test_job_prepare(Job *job)
|
|
|
|
{
|
|
|
|
g_assert(qemu_get_current_aio_context() == qemu_get_aio_context());
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static int coroutine_fn test_job_run(Job *job, Error **errp)
|
|
|
|
{
|
|
|
|
TestBlockJob *s = container_of(job, TestBlockJob, common.job);
|
|
|
|
|
|
|
|
job_transition_to_ready(&s->common.job);
|
|
|
|
while (!s->should_complete) {
|
|
|
|
s->n++;
|
|
|
|
g_assert(qemu_get_current_aio_context() == job->aio_context);
|
|
|
|
|
|
|
|
/* Avoid job_sleep_ns() because it marks the job as !busy. We want to
|
|
|
|
* emulate some actual activity (probably some I/O) here so that the
|
|
|
|
* drain involved in AioContext switches has to wait for this activity
|
|
|
|
* to stop. */
|
|
|
|
qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 1000000);
|
|
|
|
|
|
|
|
job_pause_point(&s->common.job);
|
|
|
|
}
|
|
|
|
|
|
|
|
g_assert(qemu_get_current_aio_context() == job->aio_context);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void test_job_complete(Job *job, Error **errp)
|
|
|
|
{
|
|
|
|
TestBlockJob *s = container_of(job, TestBlockJob, common.job);
|
|
|
|
s->should_complete = true;
|
|
|
|
}
|
|
|
|
|
|
|
|
BlockJobDriver test_job_driver = {
|
|
|
|
.job_driver = {
|
|
|
|
.instance_size = sizeof(TestBlockJob),
|
|
|
|
.free = block_job_free,
|
|
|
|
.user_resume = block_job_user_resume,
|
|
|
|
.run = test_job_run,
|
|
|
|
.complete = test_job_complete,
|
|
|
|
.prepare = test_job_prepare,
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
|
|
|
static void test_attach_blockjob(void)
|
|
|
|
{
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
|
|
|
BlockBackend *blk;
|
|
|
|
BlockDriverState *bs;
|
|
|
|
TestBlockJob *tjob;
|
|
|
|
|
2019-04-25 14:25:10 +02:00
|
|
|
blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL, BLK_PERM_ALL);
|
2019-05-03 19:17:44 +02:00
|
|
|
bs = bdrv_new_open_driver(&bdrv_test, "base", BDRV_O_RDWR, &error_abort);
|
|
|
|
blk_insert_bs(blk, bs, &error_abort);
|
|
|
|
|
|
|
|
tjob = block_job_create("job0", &test_job_driver, NULL, bs,
|
|
|
|
0, BLK_PERM_ALL,
|
|
|
|
0, 0, NULL, NULL, &error_abort);
|
|
|
|
job_start(&tjob->common.job);
|
|
|
|
|
|
|
|
while (tjob->n == 0) {
|
|
|
|
aio_poll(qemu_get_aio_context(), false);
|
|
|
|
}
|
|
|
|
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, ctx, &error_abort);
|
2019-05-03 19:17:44 +02:00
|
|
|
|
|
|
|
tjob->n = 0;
|
|
|
|
while (tjob->n == 0) {
|
|
|
|
aio_poll(qemu_get_aio_context(), false);
|
|
|
|
}
|
|
|
|
|
|
|
|
aio_context_acquire(ctx);
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, qemu_get_aio_context(), &error_abort);
|
2019-05-03 19:17:44 +02:00
|
|
|
aio_context_release(ctx);
|
|
|
|
|
|
|
|
tjob->n = 0;
|
|
|
|
while (tjob->n == 0) {
|
|
|
|
aio_poll(qemu_get_aio_context(), false);
|
|
|
|
}
|
|
|
|
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, ctx, &error_abort);
|
2019-05-03 19:17:44 +02:00
|
|
|
|
|
|
|
tjob->n = 0;
|
|
|
|
while (tjob->n == 0) {
|
|
|
|
aio_poll(qemu_get_aio_context(), false);
|
|
|
|
}
|
|
|
|
|
|
|
|
aio_context_acquire(ctx);
|
|
|
|
job_complete_sync(&tjob->common.job, &error_abort);
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, qemu_get_aio_context(), &error_abort);
|
2019-05-03 19:17:44 +02:00
|
|
|
aio_context_release(ctx);
|
|
|
|
|
|
|
|
bdrv_unref(bs);
|
|
|
|
blk_unref(blk);
|
|
|
|
}
|
|
|
|
|
2019-05-06 19:18:00 +02:00
|
|
|
/*
|
|
|
|
* Test that changing the AioContext for one node in a tree (here through blk)
|
|
|
|
* changes all other nodes as well:
|
|
|
|
*
|
|
|
|
* blk
|
|
|
|
* |
|
|
|
|
* | bs_verify [blkverify]
|
|
|
|
* | / \
|
|
|
|
* | / \
|
|
|
|
* bs_a [bdrv_test] bs_b [bdrv_test]
|
|
|
|
*
|
|
|
|
*/
|
|
|
|
static void test_propagate_basic(void)
|
|
|
|
{
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
2019-07-19 11:26:13 +02:00
|
|
|
AioContext *main_ctx;
|
2019-05-06 19:18:00 +02:00
|
|
|
BlockBackend *blk;
|
|
|
|
BlockDriverState *bs_a, *bs_b, *bs_verify;
|
|
|
|
QDict *options;
|
|
|
|
|
2020-05-13 13:05:39 +02:00
|
|
|
/*
|
|
|
|
* Create bs_a and its BlockBackend. We cannot take the RESIZE
|
|
|
|
* permission because blkverify will not share it on the test
|
|
|
|
* image.
|
|
|
|
*/
|
|
|
|
blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL & ~BLK_PERM_RESIZE,
|
|
|
|
BLK_PERM_ALL);
|
2019-05-06 19:18:00 +02:00
|
|
|
bs_a = bdrv_new_open_driver(&bdrv_test, "bs_a", BDRV_O_RDWR, &error_abort);
|
|
|
|
blk_insert_bs(blk, bs_a, &error_abort);
|
|
|
|
|
|
|
|
/* Create bs_b */
|
|
|
|
bs_b = bdrv_new_open_driver(&bdrv_test, "bs_b", BDRV_O_RDWR, &error_abort);
|
|
|
|
|
|
|
|
/* Create blkverify filter that references both bs_a and bs_b */
|
|
|
|
options = qdict_new();
|
|
|
|
qdict_put_str(options, "driver", "blkverify");
|
|
|
|
qdict_put_str(options, "test", "bs_a");
|
|
|
|
qdict_put_str(options, "raw", "bs_b");
|
|
|
|
|
|
|
|
bs_verify = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
|
|
|
|
|
|
|
|
/* Switch the AioContext */
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, ctx, &error_abort);
|
2019-05-06 19:18:00 +02:00
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_a) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_verify) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_b) == ctx);
|
|
|
|
|
|
|
|
/* Switch the AioContext back */
|
2019-07-19 11:26:13 +02:00
|
|
|
main_ctx = qemu_get_aio_context();
|
|
|
|
aio_context_acquire(ctx);
|
|
|
|
blk_set_aio_context(blk, main_ctx, &error_abort);
|
|
|
|
aio_context_release(ctx);
|
|
|
|
g_assert(blk_get_aio_context(blk) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_a) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_verify) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_b) == main_ctx);
|
2019-05-06 19:18:00 +02:00
|
|
|
|
|
|
|
bdrv_unref(bs_verify);
|
|
|
|
bdrv_unref(bs_b);
|
|
|
|
bdrv_unref(bs_a);
|
|
|
|
blk_unref(blk);
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Test that diamonds in the graph don't lead to endless recursion:
|
|
|
|
*
|
|
|
|
* blk
|
|
|
|
* |
|
|
|
|
* bs_verify [blkverify]
|
|
|
|
* / \
|
|
|
|
* / \
|
|
|
|
* bs_b [raw] bs_c[raw]
|
|
|
|
* \ /
|
|
|
|
* \ /
|
|
|
|
* bs_a [bdrv_test]
|
|
|
|
*/
|
|
|
|
static void test_propagate_diamond(void)
|
|
|
|
{
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
2019-07-19 11:26:13 +02:00
|
|
|
AioContext *main_ctx;
|
2019-05-06 19:18:00 +02:00
|
|
|
BlockBackend *blk;
|
|
|
|
BlockDriverState *bs_a, *bs_b, *bs_c, *bs_verify;
|
|
|
|
QDict *options;
|
|
|
|
|
|
|
|
/* Create bs_a */
|
|
|
|
bs_a = bdrv_new_open_driver(&bdrv_test, "bs_a", BDRV_O_RDWR, &error_abort);
|
|
|
|
|
|
|
|
/* Create bs_b and bc_c */
|
|
|
|
options = qdict_new();
|
|
|
|
qdict_put_str(options, "driver", "raw");
|
|
|
|
qdict_put_str(options, "file", "bs_a");
|
|
|
|
qdict_put_str(options, "node-name", "bs_b");
|
|
|
|
bs_b = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
|
|
|
|
|
|
|
|
options = qdict_new();
|
|
|
|
qdict_put_str(options, "driver", "raw");
|
|
|
|
qdict_put_str(options, "file", "bs_a");
|
|
|
|
qdict_put_str(options, "node-name", "bs_c");
|
|
|
|
bs_c = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
|
|
|
|
|
|
|
|
/* Create blkverify filter that references both bs_b and bs_c */
|
|
|
|
options = qdict_new();
|
|
|
|
qdict_put_str(options, "driver", "blkverify");
|
|
|
|
qdict_put_str(options, "test", "bs_b");
|
|
|
|
qdict_put_str(options, "raw", "bs_c");
|
|
|
|
|
|
|
|
bs_verify = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
|
2020-05-13 13:05:39 +02:00
|
|
|
/*
|
|
|
|
* Do not take the RESIZE permission: This would require the same
|
|
|
|
* from bs_c and thus from bs_a; however, blkverify will not share
|
|
|
|
* it on bs_b, and thus it will not be available for bs_a.
|
|
|
|
*/
|
|
|
|
blk = blk_new(qemu_get_aio_context(), BLK_PERM_ALL & ~BLK_PERM_RESIZE,
|
|
|
|
BLK_PERM_ALL);
|
2019-05-06 19:18:00 +02:00
|
|
|
blk_insert_bs(blk, bs_verify, &error_abort);
|
|
|
|
|
|
|
|
/* Switch the AioContext */
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, ctx, &error_abort);
|
2019-05-06 19:18:00 +02:00
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_verify) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_a) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_b) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_c) == ctx);
|
|
|
|
|
|
|
|
/* Switch the AioContext back */
|
2019-07-19 11:26:13 +02:00
|
|
|
main_ctx = qemu_get_aio_context();
|
|
|
|
aio_context_acquire(ctx);
|
|
|
|
blk_set_aio_context(blk, main_ctx, &error_abort);
|
|
|
|
aio_context_release(ctx);
|
|
|
|
g_assert(blk_get_aio_context(blk) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_verify) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_a) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_b) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs_c) == main_ctx);
|
2019-05-06 19:18:00 +02:00
|
|
|
|
|
|
|
blk_unref(blk);
|
|
|
|
bdrv_unref(bs_verify);
|
|
|
|
bdrv_unref(bs_c);
|
|
|
|
bdrv_unref(bs_b);
|
|
|
|
bdrv_unref(bs_a);
|
|
|
|
}
|
|
|
|
|
2019-05-06 19:18:05 +02:00
|
|
|
static void test_propagate_mirror(void)
|
|
|
|
{
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
|
|
|
AioContext *main_ctx = qemu_get_aio_context();
|
2019-05-03 13:20:54 +02:00
|
|
|
BlockDriverState *src, *target, *filter;
|
2019-05-06 19:18:05 +02:00
|
|
|
BlockBackend *blk;
|
|
|
|
Job *job;
|
|
|
|
Error *local_err = NULL;
|
|
|
|
|
|
|
|
/* Create src and target*/
|
|
|
|
src = bdrv_new_open_driver(&bdrv_test, "src", BDRV_O_RDWR, &error_abort);
|
|
|
|
target = bdrv_new_open_driver(&bdrv_test, "target", BDRV_O_RDWR,
|
|
|
|
&error_abort);
|
|
|
|
|
|
|
|
/* Start a mirror job */
|
|
|
|
mirror_start("job0", src, target, NULL, JOB_DEFAULT, 0, 0, 0,
|
2019-07-24 19:12:30 +02:00
|
|
|
MIRROR_SYNC_MODE_NONE, MIRROR_OPEN_BACKING_CHAIN, false,
|
2019-05-06 19:18:05 +02:00
|
|
|
BLOCKDEV_ON_ERROR_REPORT, BLOCKDEV_ON_ERROR_REPORT,
|
|
|
|
false, "filter_node", MIRROR_COPY_MODE_BACKGROUND,
|
|
|
|
&error_abort);
|
|
|
|
job = job_get("job0");
|
2019-05-03 13:20:54 +02:00
|
|
|
filter = bdrv_find_node("filter_node");
|
2019-05-06 19:18:05 +02:00
|
|
|
|
|
|
|
/* Change the AioContext of src */
|
|
|
|
bdrv_try_set_aio_context(src, ctx, &error_abort);
|
|
|
|
g_assert(bdrv_get_aio_context(src) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(target) == ctx);
|
2019-05-03 13:20:54 +02:00
|
|
|
g_assert(bdrv_get_aio_context(filter) == ctx);
|
2019-05-06 19:18:05 +02:00
|
|
|
g_assert(job->aio_context == ctx);
|
|
|
|
|
|
|
|
/* Change the AioContext of target */
|
|
|
|
aio_context_acquire(ctx);
|
|
|
|
bdrv_try_set_aio_context(target, main_ctx, &error_abort);
|
|
|
|
aio_context_release(ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(src) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(target) == main_ctx);
|
2019-05-03 13:20:54 +02:00
|
|
|
g_assert(bdrv_get_aio_context(filter) == main_ctx);
|
2019-05-06 19:18:05 +02:00
|
|
|
|
|
|
|
/* With a BlockBackend on src, changing target must fail */
|
2019-04-25 14:25:10 +02:00
|
|
|
blk = blk_new(qemu_get_aio_context(), 0, BLK_PERM_ALL);
|
2019-05-06 19:18:05 +02:00
|
|
|
blk_insert_bs(blk, src, &error_abort);
|
|
|
|
|
|
|
|
bdrv_try_set_aio_context(target, ctx, &local_err);
|
2020-06-30 11:03:30 +02:00
|
|
|
error_free_or_abort(&local_err);
|
2019-05-06 19:18:05 +02:00
|
|
|
|
|
|
|
g_assert(blk_get_aio_context(blk) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(src) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(target) == main_ctx);
|
2019-05-03 13:20:54 +02:00
|
|
|
g_assert(bdrv_get_aio_context(filter) == main_ctx);
|
2019-05-06 19:18:05 +02:00
|
|
|
|
|
|
|
/* ...unless we explicitly allow it */
|
|
|
|
aio_context_acquire(ctx);
|
|
|
|
blk_set_allow_aio_context_change(blk, true);
|
|
|
|
bdrv_try_set_aio_context(target, ctx, &error_abort);
|
|
|
|
aio_context_release(ctx);
|
|
|
|
|
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(src) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(target) == ctx);
|
2019-05-03 13:20:54 +02:00
|
|
|
g_assert(bdrv_get_aio_context(filter) == ctx);
|
2019-05-06 19:18:05 +02:00
|
|
|
|
|
|
|
job_cancel_sync_all();
|
|
|
|
|
|
|
|
aio_context_acquire(ctx);
|
2019-05-02 11:10:59 +02:00
|
|
|
blk_set_aio_context(blk, main_ctx, &error_abort);
|
2019-05-06 19:18:05 +02:00
|
|
|
bdrv_try_set_aio_context(target, main_ctx, &error_abort);
|
|
|
|
aio_context_release(ctx);
|
|
|
|
|
|
|
|
blk_unref(blk);
|
|
|
|
bdrv_unref(src);
|
|
|
|
bdrv_unref(target);
|
|
|
|
}
|
|
|
|
|
2019-04-24 17:49:33 +02:00
|
|
|
static void test_attach_second_node(void)
|
|
|
|
{
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
|
|
|
AioContext *main_ctx = qemu_get_aio_context();
|
|
|
|
BlockBackend *blk;
|
|
|
|
BlockDriverState *bs, *filter;
|
|
|
|
QDict *options;
|
|
|
|
|
|
|
|
blk = blk_new(ctx, BLK_PERM_ALL, BLK_PERM_ALL);
|
|
|
|
bs = bdrv_new_open_driver(&bdrv_test, "base", BDRV_O_RDWR, &error_abort);
|
|
|
|
blk_insert_bs(blk, bs, &error_abort);
|
|
|
|
|
|
|
|
options = qdict_new();
|
|
|
|
qdict_put_str(options, "driver", "raw");
|
|
|
|
qdict_put_str(options, "file", "base");
|
|
|
|
|
|
|
|
filter = bdrv_open(NULL, NULL, options, BDRV_O_RDWR, &error_abort);
|
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(filter) == ctx);
|
|
|
|
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_acquire(ctx);
|
2019-04-24 17:49:33 +02:00
|
|
|
blk_set_aio_context(blk, main_ctx, &error_abort);
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_release(ctx);
|
2019-04-24 17:49:33 +02:00
|
|
|
g_assert(blk_get_aio_context(blk) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs) == main_ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(filter) == main_ctx);
|
|
|
|
|
|
|
|
bdrv_unref(filter);
|
|
|
|
bdrv_unref(bs);
|
|
|
|
blk_unref(blk);
|
|
|
|
}
|
|
|
|
|
2019-04-24 17:47:34 +02:00
|
|
|
static void test_attach_preserve_blk_ctx(void)
|
|
|
|
{
|
|
|
|
IOThread *iothread = iothread_new();
|
|
|
|
AioContext *ctx = iothread_get_aio_context(iothread);
|
|
|
|
BlockBackend *blk;
|
|
|
|
BlockDriverState *bs;
|
|
|
|
|
|
|
|
blk = blk_new(ctx, BLK_PERM_ALL, BLK_PERM_ALL);
|
|
|
|
bs = bdrv_new_open_driver(&bdrv_test, "base", BDRV_O_RDWR, &error_abort);
|
|
|
|
bs->total_sectors = 65536 / BDRV_SECTOR_SIZE;
|
|
|
|
|
|
|
|
/* Add node to BlockBackend that has an iothread context assigned */
|
|
|
|
blk_insert_bs(blk, bs, &error_abort);
|
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs) == ctx);
|
|
|
|
|
|
|
|
/* Remove the node again */
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_acquire(ctx);
|
2019-04-24 17:47:34 +02:00
|
|
|
blk_remove_bs(blk);
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_release(ctx);
|
2019-04-24 17:47:34 +02:00
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
2019-04-24 18:04:42 +02:00
|
|
|
g_assert(bdrv_get_aio_context(bs) == qemu_get_aio_context());
|
2019-04-24 17:47:34 +02:00
|
|
|
|
|
|
|
/* Re-attach the node */
|
|
|
|
blk_insert_bs(blk, bs, &error_abort);
|
|
|
|
g_assert(blk_get_aio_context(blk) == ctx);
|
|
|
|
g_assert(bdrv_get_aio_context(bs) == ctx);
|
|
|
|
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_acquire(ctx);
|
2019-04-24 17:47:34 +02:00
|
|
|
blk_set_aio_context(blk, qemu_get_aio_context(), &error_abort);
|
2019-07-19 11:26:13 +02:00
|
|
|
aio_context_release(ctx);
|
2019-04-24 17:47:34 +02:00
|
|
|
bdrv_unref(bs);
|
|
|
|
blk_unref(blk);
|
|
|
|
}
|
|
|
|
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
int main(int argc, char **argv)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
bdrv_init();
|
|
|
|
qemu_init_main_loop(&error_abort);
|
|
|
|
|
|
|
|
g_test_init(&argc, &argv, NULL);
|
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(sync_op_tests); i++) {
|
|
|
|
const SyncOpTest *t = &sync_op_tests[i];
|
|
|
|
g_test_add_data_func(t->name, t, test_sync_op);
|
|
|
|
}
|
|
|
|
|
2019-05-03 19:17:44 +02:00
|
|
|
g_test_add_func("/attach/blockjob", test_attach_blockjob);
|
2019-04-24 17:49:33 +02:00
|
|
|
g_test_add_func("/attach/second_node", test_attach_second_node);
|
2019-04-24 17:47:34 +02:00
|
|
|
g_test_add_func("/attach/preserve_blk_ctx", test_attach_preserve_blk_ctx);
|
2019-05-06 19:18:00 +02:00
|
|
|
g_test_add_func("/propagate/basic", test_propagate_basic);
|
|
|
|
g_test_add_func("/propagate/diamond", test_propagate_diamond);
|
2019-05-06 19:18:05 +02:00
|
|
|
g_test_add_func("/propagate/mirror", test_propagate_mirror);
|
2019-05-03 19:17:44 +02:00
|
|
|
|
block: Fix hangs in synchronous APIs with iothreads
In the block layer, synchronous APIs are often implemented by creating a
coroutine that calls the asynchronous coroutine-based implementation and
then waiting for completion with BDRV_POLL_WHILE().
For this to work with iothreads (more specifically, when the synchronous
API is called in a thread that is not the home thread of the block
device, so that the coroutine will run in a different thread), we must
make sure to call aio_wait_kick() at the end of the operation. Many
places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
the condition has long become false.
Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
generally not enough for most other operations because they haven't set
the return value in the coroutine entry stub yet. To avoid race
conditions there, we need to kick after setting the return value.
The race window is small enough that the problem doesn't usually surface
in the common path. However, it does surface and causes easily
reproducible hangs if the operation can return early before even calling
bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
success paths).
The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
slightly different: These functions even neglected to schedule the
coroutine in the home thread of the node. This avoids the hang, but is
obviously wrong, too. Fix those to schedule the coroutine in the right
AioContext in addition to adding aio_wait_kick() calls.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-01-07 13:02:48 +01:00
|
|
|
return g_test_run();
|
|
|
|
}
|