We have two drivers (iscsi and file-posix) that (in some cases) return
success from their .bdrv_co_truncate() implementation if the block
device is larger than the requested offset, but cannot be shrunk. Some
callers do not want that behavior, so this patch adds a new parameter
that they can use to turn off that behavior.
This patch just adds the parameter and lets the block/io.c and
block/block-backend.c functions pass it around. All other callers
always pass false and none of the implementations evaluate it, so that
this patch does not change existing behavior. Future patches take care
of that.
Suggested-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190918095144.955-5-mreitz@redhat.com
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
There is no reason why the format drivers need to truncate the protocol
node when formatting it. When using the old .bdrv_co_create_ops()
interface, the file will be created with no size option anyway, which
generally gives it a size of 0. (Exceptions are block devices, which
cannot be truncated anyway.)
When using blockdev-create, the user must have given the file node some
size anyway, so there is no reason why we should override that.
qed is an exception, it needs the file to start completely empty (as
explained by c743849bee).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190918095144.955-4-mreitz@redhat.com
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
No other filter driver has a .bdrv_co_truncate() implementation, and
there is no need to because the general block layer code can handle it
just as well.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190918095144.955-3-mreitz@redhat.com
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make the filter truncation (passing it through to bs->file) a
first-class citizen and handle it exactly as if it was the filter
driver's native implementation of .bdrv_co_truncate().
I do not see a reason not to, it makes the code a bit shorter, and may
be even more correct because this gets us to finish the write_req that
we prepared before (may be important to e.g. bring dirty bitmaps to the
correct size).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190918095144.955-2-mreitz@redhat.com
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2 v3 images require every snapshot table entry to have at least 16
bytes of extra data. If they do not, let qemu-img check -r all fix it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-15-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The user cannot choose which snapshots are removed. This is fine
because we have chosen the maximum snapshot table size to be so large
(65536 entries) that it cannot be reasonably reached. If the snapshot
table exceeds this size, the image has probably been corrupted in some
way; in this case, it is most important to just make the image usable
such that the user can copy off at least the active layer.
(Also note that the snapshots will be removed only with "-r all", so a
plain "check" or "check -r leaks" will not delete any data.)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
We currently refuse to open qcow2 images with overly long snapshot
tables. This patch makes qemu-img check -r all drop all offending
entries past what we deem acceptable.
The user cannot choose which snapshots are removed. This is fine
because we have chosen the maximum snapshot table size to be so large
(64 MB) that it cannot be reasonably reached. If the snapshot table
exceeds this size, the image has probably been corrupted in some way; in
this case, it is most important to just make the image usable such that
the user can copy off at least the active layer.
(Also note that the snapshots will be removed only with "-r all", so a
plain "check" or "check -r leaks" will not delete any data.)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-13-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
When repairing the snapshot table, we truncate entries that have too
much extra data. This frees up space that we do not have to count
towards the snapshot table size.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-12-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The only case where we currently reject snapshot table entries is when
they have too much extra data. Fix them with qemu-img check -r all by
counting it as a corruption, reducing their extra_data_size, and then
letting qcow2_check_fix_snapshot_table() do the rest.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2_check_read_snapshot_table() can perform consistency checks, but it
cannot fix everything. Specifically, it cannot allocate new clusters,
because that should wait until the refcount structures are known to be
consistent (i.e., after qcow2_check_refcounts()). Thus, it cannot call
qcow2_write_snapshots().
Do that in qcow2_check_fix_snapshot_table(), which is called after
qcow2_check_refcounts().
Currently, there is nothing that would set result->corruptions, so this
is a no-op. A follow-up patch will change that.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reading the snapshot table can fail. That is a problem when we want to
repair the image.
Therefore, stop reading the snapshot table in qcow2_do_open() in check
mode. Instead, add a new function qcow2_check_read_snapshot_table()
that reads the snapshot table at a later point. In the future, we want
to handle errors here and fix them.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2 v3 requires every snapshot table entry to have two extra data
fields: The 64-bit VM state size, and the virtual disk size. Both are
optional for v2 images, so they may not be present.
qcow2_upgrade() therefore should update the snapshot table to ensure all
entries have these extra data fields.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1727347
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This does not make sense right now, but it will make sense once we need
to do more than to just update s->qcow_version.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-7-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Updating the snapshot list will be useful when upgrading a v2 image to
v3, so we will need to call this function in qcow2.c.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The qcow2 specification says to ignore unknown extra data fields in
snapshot table entries. Currently, we discard it whenever we update the
image, which is a bit different from "ignore".
This patch makes the qcow2 driver keep all unknown extra data fields
when updating an image's snapshot table.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-5-mreitz@redhat.com
[mreitz: Adjusted comments as proposed by Eric]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20191011152814.14791-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
mirror_exit_common() may be called twice (if it is called from
mirror_prepare() and fails, it will be called from mirror_abort()
again).
In such a case, many of the pointers in the MirrorBlockJob object will
already be freed. This can be seen most reliably for s->target, which
is set to NULL (and then dereferenced by blk_bs()).
Cc: qemu-stable@nongnu.org
Fixes: 737efc1eda
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20191014153931.20699-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
No reason to limit buffered copy to one cluster. Let's allow up to 1
MiB.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191022111805.3432-7-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently total allocation for parallel requests to block-copy instance
is unlimited. Let's limit it to 128 MiB.
For now block-copy is used only in backup, so actually we limit total
allocation for backup job.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191022111805.3432-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Merge copying code into one function block_copy_do_copy, which only
calls bdrv_ io functions and don't do any synchronization (like dirty
bitmap set/reset).
Refactor block_copy() function so that it takes full decision about
size of chunk to be copied and does all the synchronization (checking
intersecting requests, set/reset dirty bitmaps).
It will help:
- introduce parallel processing of block_copy iterations: we need to
calculate chunk size, start async chunk copying and go to the next
iteration
- simplify synchronization improvement (like memory limiting in
further commit and reducing critical section (now we lock the whole
requested range, when actually we need to lock only dirty region
which we handle at the moment))
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191022111805.3432-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Large copy range may imply memory allocation and large io effort, so
using 2G copy range request may be bad idea. Let's limit it to 16 MiB.
It also helps the following patch to refactor copy-with-offload
fallback to copy-with-bounce-buffer.
Note, that total memory usage of backup is still not limited, it will
be fixed in further commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191022111805.3432-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Move bounce_buffer allocation block_copy_with_bounce_buffer. This
commit simplifies further work on implementing copying by larger chunks
(of different size) and further asynchronous handling of block_copy
iterations (with help of block/aio_task API).
Allocation works fast, a lot faster than disk io, so it's not a problem
that we now allocate/free bounce_buffer more times. And we anyway will
have to allocate several bounce_buffers for parallel execution of loop
iterations in future.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191022111805.3432-2-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This reverts commit 9adc1cb49a.
"mirror: Only mirror granularity-aligned chunks"
Since previous commit unaligned chunks are supported by
do_sync_target_write.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191011090711.19940-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Prior 9adc1cb49a do_sync_target_write had a bug: it reset aligned-up
region in the dirty bitmap, which means that we may not copy some bytes
and assume them copied, which actually leads to producing corrupted
target.
So 9adc1cb49a forced dirty bitmap granularity to be
request_alignment for mirror-top filter, so we are not working with
unaligned requests. However forcing large alignment obviously decreases
performance of unaligned requests.
This commit provides another solution for the problem: if unaligned
padding is already dirty, we can safely ignore it, as
1. It's dirty, it will be copied by mirror_iteration anyway
2. It's dirty, so skipping it now we don't increase dirtiness of the
bitmap and therefore don't damage "synchronicity" of the
write-blocking mirror.
If unaligned padding is not dirty, we just write it, no reason to touch
dirty bitmap if we succeed (on failure we'll set the whole region
ofcourse, but we loss "synchronicity" on failure anyway).
Note: we need to disable dirty_bitmap, otherwise we will not be able to
see in do_sync_target_write bitmap state before current operation. We
may of course check dirty bitmap before the operation in
bdrv_mirror_top_do_write and remember it, but we don't need active
dirty bitmap for write-blocking mirror anyway.
New code-path is unused until the following commit reverts
9adc1cb49a.
Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20191011090711.19940-5-vsementsov@virtuozzo.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add blk write function with qiov_offset parameter. It's needed for the
following commit.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191011090711.19940-4-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
do_sync_target_write is called from bdrv_mirror_top_do_write after
write/discard operation, all inside active_write/active_write_settle
protecting us from mirror iteration. So the whole area is dirty for
sure, no reason to examine dirty bitmap.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20191011090711.19940-3-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2_detect_metadata_preallocation() calls qcow2_get_refcount() which
requires s->lock to be taken to protect its accesses to the refcount
table and refcount blocks. However, nothing in this code path actually
took the lock. This could cause the same cache entry to be used by two
requests at the same time, for different tables at different offsets,
resulting in image corruption.
As it would be preferable to base the detection on consistent data (even
though it's just heuristics), let's take the lock not only around the
qcow2_get_refcount() calls, but around the whole function.
This patch takes the lock in qcow2_co_block_status() earlier and asserts
in qcow2_detect_metadata_preallocation() that we hold the lock.
Fixes: 69f47505ee
Cc: qemu-stable@nongnu.org
Reported-by: Michael Weiser <michael.weiser@gmx.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
After commit 00e30f05de, there is no more "goto error" points
after job creation, so after "error:" @job is always NULL and we don't
need roll-back job creation.
Reported-by: Coverity (CID 1406402)
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Implement reconnect. To achieve this:
1. add new modes:
connecting-wait: means, that reconnecting is in progress, and there
were small number of reconnect attempts, so all requests are
waiting for the connection.
connecting-nowait: reconnecting is in progress, there were a lot of
attempts of reconnect, all requests will return errors.
two old modes are used too:
connected: normal state
quit: exiting after fatal error or on close
Possible transitions are:
* -> quit
connecting-* -> connected
connecting-wait -> connecting-nowait (transition is done after
reconnect-delay seconds in connecting-wait mode)
connected -> connecting-wait
2. Implement reconnect in connection_co. So, in connecting-* mode,
connection_co, tries to reconnect unlimited times.
3. Retry nbd queries on channel error, if we are in connecting-wait
state.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20191009084158.15614-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The only reason I can imagine for this strange code at the very-end of
bdrv_reopen_commit is the fact that bs->read_only updated after
calling drv->bdrv_reopen_commit in bdrv_reopen_commit. And in the same
time, prior to previous commit, qcow2_reopen_bitmaps_rw did a wrong
check for being writable, when actually it only need writable file
child not self.
So, as it's fixed, let's move things to correct place.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190927122355.7344-10-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
- Correct check for write access to file child, and in correct place
(only if we want to write).
- Support reopen rw -> rw (which will be used in following commit),
for example, !bdrv_dirty_bitmap_readonly() is not a corruption if
bitmap is marked IN_USE in the image.
- Consider unexpected bitmap as a corruption and check other
combinations of in-image and in-RAM bitmaps.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190927122355.7344-9-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
qcow2_reopen_bitmaps_ro wants to store bitmaps and then mark them all
readonly. But the latter don't work, as
qcow2_store_persistent_dirty_bitmaps removes bitmaps after storing.
It's OK for inactivation but bad idea for reopen-ro. And this leads to
the following bug:
Assume we have persistent bitmap 'bitmap0'.
Create external snapshot
bitmap0 is stored and therefore removed
Commit snapshot
now we have no bitmaps
Do some writes from guest (*)
they are not marked in bitmap
Shutdown
Start
bitmap0 is loaded as valid, but it is actually broken! It misses
writes (*)
Incremental backup
it will be inconsistent
So, let's stop removing bitmaps on reopen-ro. But don't rejoice:
reopening bitmaps to rw is broken too, so the whole scenario will not
work after this patch and we can't enable corresponding test cases in
260 iotests still. Reopening bitmaps rw will be fixed in the following
patches.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190927122355.7344-7-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
The function is unused, drop it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190927122355.7344-6-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
Firstly, no reason to optimize failure path. Then, function name is
ambiguous: it checks for readonly and similar things, but someone may
think that it will ignore normal bitmaps which was just unchanged, and
this is in bad relation with the fact that we should drop IN_USE flag
for unchanged bitmaps in the image.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190927122355.7344-5-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
bdrv_dirty_bitmap_next is always used in same pattern. So, split it
into _next and _first, instead of combining two functions into one and
add FOR_EACH_DIRTY_BITMAP macro.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190916141911.5255-5-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
mutex field is just a pointer to bs->dirty_bitmap_mutex, so no needs
to store it in BdrvDirtyBitmap when we have bs pointer in it (since
previous patch).
Drop mutex field. Constantly use bdrv_dirty_bitmaps_lock/unlock in
block/dirty-bitmap.c to make it more obvious that it's not per-bitmap
lock. Still, for simplicity, leave bdrv_dirty_bitmap_lock/unlock
functions as an external API.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190916141911.5255-4-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
Add bs field to BdrvDirtyBitmap structure. Drop BlockDriverState
parameter from bitmap APIs where possible.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190916141911.5255-3-vsementsov@virtuozzo.com
[Rebased on top of block-copy. --js]
Signed-off-by: John Snow <jsnow@redhat.com>
Drop meta bitmaps, as they are unused.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190916141911.5255-2-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
qmp_block_dirty_bitmap_add and do_block_dirty_bitmap_remove do acquire
aio context since 0a6c86d024. But this is not enough: we also must
lock qcow2 mutex when access in-image metadata. Especially it concerns
freeing qcow2 clusters.
To achieve this, move qcow2_can_store_new_dirty_bitmap and
qcow2_remove_persistent_dirty_bitmap to coroutine context.
Since we work in coroutines in correct aio context, we don't need
context acquiring in blockdev.c anymore, drop it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190920082543.23444-4-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
It's more comfortable to not deal with local_err.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190920082543.23444-3-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
block/dirty-bitmap.c seems to be more appropriate for it and
bdrv_remove_persistent_dirty_bitmap already in it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20190920082543.23444-2-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
When the COW areas are included, the size of an allocation can exceed
INT_MAX. This is kind of limited by handle_alloc() in that it already
caps avail_bytes at INT_MAX, but the number of clusters still reflects
the original length.
This can have all sorts of effects, ranging from the storage layer write
call failing to image corruption. (If there were no image corruption,
then I suppose there would be data loss because the .cow_end area is
forced to be empty, even though there might be something we need to
COW.)
Fix all of it by limiting nb_clusters so the equivalent number of bytes
will not exceed INT_MAX.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The BDRV_REQ_NO_FALLBACK flag means that an operation should only be
performed if it can be offloaded or otherwise performed efficiently.
However a misaligned write request requires a RMW so we should return
an error and let the caller decide how to proceed.
This hits an assertion since commit c8bb23cbdb if the required
alignment is larger than the cluster size:
qemu-img create -f qcow2 -o cluster_size=2k img.qcow2 4G
qemu-io -c "open -o driver=qcow2,file.align=4k blkdebug::img.qcow2" \
-c 'write 0 512'
qemu-io: block/io.c:1127: bdrv_driver_pwritev: Assertion `!(flags & BDRV_REQ_NO_FALLBACK)' failed.
Aborted
The reason is that when writing to an unallocated cluster we try to
skip the copy-on-write part and zeroize it using BDRV_REQ_NO_FALLBACK
instead, resulting in a write request that is too small (2KB cluster
size vs 4KB required alignment).
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Replay is capable of recording normal BH events, but sometimes
there are single use callbacks scheduled with aio_bh_schedule_oneshot
function. This patch enables recording and replaying such callbacks.
Block layer uses these events for calling the completion function.
Replaying these calls makes the execution deterministic.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In record/replay mode bdrv queue is controlled by replay mechanism.
It does not allow saving or loading the snapshots
when bdrv queue is not empty. Stopping the VM is not blocked by nonempty
queue, but flushing the queue is still impossible there,
because it may cause deadlocks in replay mode.
This patch disables bdrv_drain_all and bdrv_flush_all in
record/replay mode.
Stopping the machine when the IO requests are not finished is needed
for the debugging. E.g., breakpoint may be set at the specified step,
and forcing the IO requests to finish may break the determinism
of the execution.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch enables making snapshots with blkreplay used in
block devices.
This function is required to make bdrv_snapshot_goto without
calling .bdrv_open which is not implemented.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qemu is currently not able to detect truncated vhdx image files.
Add a basic check if all allocated blocks are reachable at open and
report all errors during bdrv_co_check.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>