Commit Graph

5270 Commits

Author SHA1 Message Date
Paolo Bonzini
b02629550d replication: move include out of root directory
The replication.h file is included from migration/colo.c and tests/unit/test-replication.c,
so it should be in include/.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:46 +02:00
Paolo Bonzini
29a6ea24eb coroutine-sleep: replace QemuCoSleepState pointer with struct in the API
Right now, users of qemu_co_sleep_ns_wakeable are simply passing
a pointer to QemuCoSleepState by reference to the function.  But
QemuCoSleepState really is just a Coroutine*; making the
content of the struct public is just as efficient and lets us
skip the user_state_pointer indirection.

Since the usage is changed, take the occasion to rename the
struct to QemuCoSleep.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-05-21 18:22:33 +01:00
Paolo Bonzini
eaee072085 coroutine-sleep: allow qemu_co_sleep_wake that wakes nothing
All callers of qemu_co_sleep_wake are checking whether they are passing
a NULL argument inside the pointer-to-pointer: do the check in
qemu_co_sleep_wake itself.

As a side effect, qemu_co_sleep_wake can be called more than once and
it will only wake the coroutine once; after the first time, the argument
will be set to NULL via *sleep_state->user_state_pointer.  However, this
would not be safe unless co_sleep_cb keeps using the QemuCoSleepState*
directly, so make it go through the pointer-to-pointer instead.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-4-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-05-21 18:22:33 +01:00
Stefan Hajnoczi
1b0b2e6d06 block/export: improve vu_blk_sect_range_ok()
The checks in vu_blk_sect_range_ok() assume VIRTIO_BLK_SECTOR_SIZE is
equal to BDRV_SECTOR_SIZE. This is true, but let's add a
QEMU_BUILD_BUG_ON() to make it explicit.

We might as well check that the request buffer size is a multiple of
VIRTIO_BLK_SECTOR_SIZE while we're at it.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210331142727.391465-1-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-18 11:08:13 +02:00
Vladimir Sementsov-Ogievskiy
38b4409647 qcow2: set bdi->is_dirty
Set bdi->is_dirty, so that qemu-img info could show dirty flag.

After this commit the following check will show '"dirty-flag": true':

./build/qemu-img create -f qcow2 -o lazy_refcounts=on x 1M
./build/qemu-io x
qemu-io> write 0 1M

 After "write" command success, kill the qemu-io process:

kill -9 <qemu-io pid>

./build/qemu-img info --output=json x

This will show '"dirty-flag": true' among other things. (before this
commit it shows '"dirty-flag": false')

Note, that qcow2's dirty-bit is not a "dirty bit for the image". It
only protects qcow2 lazy refcounts feature. So, there are a lot of
conditions when qcow2 session may be not closed correctly, but bit is
0. Still, when bit is set, the last session is definitely not finished
correctly and it's better to report it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210504160656.462836-1-vsementsov@virtuozzo.com>
Tested-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-18 11:08:13 +02:00
Vladimir Sementsov-Ogievskiy
c61ebf362d write-threshold: deal with includes
"qemu/typedefs.h" is enough for include/block/write-threshold.h header
with forward declaration of BlockDriverState. Also drop extra includes
from block/write-threshold.c and tests/unit/test-write-threshold.c

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210506090621.11848-9-vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Vladimir Sementsov-Ogievskiy
2e0e9cbd89 block/write-threshold: drop extra APIs
bdrv_write_threshold_exceeded() is unused.

bdrv_write_threshold_is_set() is used only to double check the value of
bs->write_threshold_offset in tests. No real sense in it (both tests do
check real value with help of bdrv_write_threshold_get())

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210506090621.11848-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[mreitz: Adjusted commit message as per Eric's suggestion]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Vladimir Sementsov-Ogievskiy
ad578c56d5 block: drop write notifiers
They are unused now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210506090621.11848-3-vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Vladimir Sementsov-Ogievskiy
94783301b8 block/write-threshold: don't use write notifiers
write-notifiers are used only for write-threshold. New code for such
purpose should create filters.

Let's better special-case write-threshold and drop write notifiers at
all. (Actually, write-threshold is special-cased anyway, as the only
user of write-notifiers)

So, create a new direct interface for bdrv_co_write_req_prepare() and
drop all write-notifier related logic from write-threshold.c.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210506090621.11848-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[mreitz: Adjusted comment as per Eric's suggestion]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Vladimir Sementsov-Ogievskiy
bcc8584c83 block/copy-on-read: use bdrv_drop_filter() and drop s->active
Now, after huge update of block graph permission update algorithm, we
don't need this workaround with active state of the filter. Drop it and
use new smart bdrv_drop_filter() function.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210506194143.394141-1-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Vladimir Sementsov-Ogievskiy
9c785cd714 mirror: stop cancelling in-flight requests on non-force cancel in READY
If mirror is READY than cancel operation is not discarding the whole
result of the operation, but instead it's a documented way get a
point-in-time snapshot of source disk.

So, we should not cancel any requests if mirror is READ and
force=false. Let's fix that case.

Note, that bug that we have before this commit is not critical, as the
only .bdrv_cancel_in_flight implementation is nbd_cancel_in_flight()
and it cancels only requests waiting for reconnection, so it should be
rare case.

Fixes: 521ff8b779
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210421075858.40197-1-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Vladimir Sementsov-Ogievskiy
78632a3d16 monitor: hmp_qemu_io: acquire aio contex, fix crash
Max reported the following bug:

$ ./qemu-img create -f raw src.img 1G
$ ./qemu-img create -f raw dst.img 1G

$ (echo '
   {"execute":"qmp_capabilities"}
   {"execute":"blockdev-mirror",
    "arguments":{"job-id":"mirror",
                 "device":"source",
                 "target":"target",
                 "sync":"full",
                 "filter-node-name":"mirror-top"}}
'; sleep 3; echo '
   {"execute":"human-monitor-command",
    "arguments":{"command-line":
                 "qemu-io mirror-top \"write 0 1G\""}}') \
| x86_64-softmmu/qemu-system-x86_64 \
   -qmp stdio \
   -blockdev file,node-name=source,filename=src.img \
   -blockdev file,node-name=target,filename=dst.img \
   -object iothread,id=iothr0 \
   -device virtio-blk,drive=source,iothread=iothr0

crashes:

0  raise () at /usr/lib/libc.so.6
1  abort () at /usr/lib/libc.so.6
2  error_exit
   (err=<optimized out>,
   msg=msg@entry=0x55fbb1634790 <__func__.27> "qemu_mutex_unlock_impl")
   at ../util/qemu-thread-posix.c:37
3  qemu_mutex_unlock_impl
   (mutex=mutex@entry=0x55fbb25ab6e0,
   file=file@entry=0x55fbb1636957 "../util/async.c",
   line=line@entry=650)
   at ../util/qemu-thread-posix.c:109
4  aio_context_release (ctx=ctx@entry=0x55fbb25ab680) at ../util/async.c:650
5  bdrv_do_drained_begin
   (bs=bs@entry=0x55fbb3a87000, recursive=recursive@entry=false,
   parent=parent@entry=0x0,
   ignore_bds_parents=ignore_bds_parents@entry=false,
   poll=poll@entry=true) at ../block/io.c:441
6  bdrv_do_drained_begin
   (poll=true, ignore_bds_parents=false, parent=0x0, recursive=false,
   bs=0x55fbb3a87000) at ../block/io.c:448
7  blk_drain (blk=0x55fbb26c5a00) at ../block/block-backend.c:1718
8  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:498
9  blk_unref (blk=0x55fbb26c5a00) at ../block/block-backend.c:491
10 hmp_qemu_io (mon=0x7fffaf3fc7d0, qdict=<optimized out>)
   at ../block/monitor/block-hmp-cmds.c:628

man pthread_mutex_unlock
...
    EPERM  The  mutex type is PTHREAD_MUTEX_ERRORCHECK or
    PTHREAD_MUTEX_RECURSIVE, or the mutex is a robust mutex, and the
    current thread does not own the mutex.

So, thread doesn't own the mutex. And we have iothread here.

Next, note that AIO_WAIT_WHILE() documents that ctx must be acquired
exactly once by caller. But where is it acquired in the call stack?
Seems nowhere.

qemuio_command do acquire aio context.. But we need context acquired
around blk_unref() as well and actually around blk_insert_bs() too.

Let's refactor qemuio_command so that it doesn't acquire aio context
but callers do that instead. This way we can cleanly acquire aio
context in hmp_qemu_io() around all three calls.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210423134233.51495-1-vsementsov@virtuozzo.com>
[mreitz: Fixed comment]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Connor Kuehl
2b99cfce08 block/rbd: Add an escape-aware strchr helper
Sometimes the parser needs to further split a token it has collected
from the token input stream. Right now, it does a cursory check to see
if the relevant characters appear in the token to determine if it should
break it down further.

However, qemu_rbd_next_tok() will escape characters as it removes tokens
from the token stream and plain strchr() won't. This can make the
initial strchr() check slightly misleading since it implies
qemu_rbd_next_tok() will find the token and split on it, except the
reality is that qemu_rbd_next_tok() will pass over it if it is escaped.

Use a custom strchr to avoid mixing escaped and unescaped string
operations. Furthermore, this code is identical to how
qemu_rbd_next_tok() seeks its next token, so incorporate this custom
strchr into the body of that function to reduce duplication.

Reported-by: Han Han <hhan@redhat.com>
Fixes: https://bugzilla.redhat.com/1873913
Signed-off-by: Connor Kuehl <ckuehl@redhat.com>
Message-Id: <20210421212343.85524-3-ckuehl@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Markus Armbruster
09ec85176e block: Drop the sheepdog block driver
It was deprecated in commit e1c4269763, v5.2.0.  See that commit
message for rationale.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210501075747.3293186-1-armbru@redhat.com>
ACKed-by: Peter Krempa <pkrempa@redhat.com>
2021-05-12 17:42:23 +02:00
Peter Maydell
4cc10cae64 * NetBSD NVMM support
* RateLimit mutex
 * Prepare for Meson 0.57 upgrade
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCROukUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOFXgf/ThwuBCbwC6pwoHpZzFXHdJRXIqHa
 iKTqjCLymz9NQBRTaMeG5CWjXl4o9syHLzEXLQxuQaynHK8AjbyeMSllBVLzBUme
 TU9AY3qwLShRJm3XGXkuUilFE+IR8FXWFgrTOsZXgbT+JQlkCgiuhCRqfAcDEgi/
 F5SNqlMzPNvF6G0FY9DFBBkoKF4YWROx25SgNl3fxgWwC94px/a22BXTVpOxaClZ
 HE/H+kbJH5sD2dOJR5cqbgFg7eBemNdxO3tSbR6WoP9pcvVPx0Dgh5hUJb5+pUXY
 fV5O5zZ+CdyNjWM4yAHg0y8kOlnqrLwv7pH+NdqWFaWiZ9uCSrVFR13ejQ==
 =sKO4
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* NetBSD NVMM support
* RateLimit mutex
* Prepare for Meson 0.57 upgrade

# gpg: Signature made Tue 04 May 2021 13:15:37 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream:
  glib-compat: accept G_TEST_SLOW environment variable
  gitlab-ci: use --meson=internal for CFI jobs
  configure: handle meson options that have changed type
  configure: reindent meson invocation
  slirp: add configure option to disable smbd
  ratelimit: protect with a mutex
  Add NVMM Accelerator: add maintainers for NetBSD/NVMM
  Add NVMM accelerator: acceleration enlightenments
  Add NVMM accelerator: x86 CPU support
  Add NVMM accelerator: configure and build logic
  oslib-win32: do not rely on macro to get redefined function name

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-06 18:56:17 +01:00
Paolo Bonzini
4951967d84 ratelimit: protect with a mutex
Right now, rate limiting is protected by the AioContext mutex, which is
taken for example both by the block jobs and by qmp_block_job_set_speed
(via find_block_job).

We would like to remove the dependency of block layer code on the
AioContext mutex, since most drivers and the core I/O code are already
not relying on it.  However, there is no existing lock that can easily
be taken by both ratelimit_set_speed and ratelimit_calculate_delay,
especially because the latter might run in coroutine context (and
therefore under a CoMutex) but the former will not.

Since concurrent calls to ratelimit_calculate_delay are not possible,
one idea could be to use a seqlock to get a snapshot of slice_ns and
slice_quota.  But for now keep it simple, and just add a mutex to the
RateLimit struct; block jobs are generally not performance critical to
the point of optimizing the clock cycles spent in synchronization.

This also requires the introduction of init/destroy functions, so
add them to the two users of ratelimit.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-04 14:15:35 +02:00
Thomas Huth
4c386f8064 Do not include sysemu/sysemu.h if it's not really necessary
Stop including sysemu/sysemu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Kevin Wolf
35b7f4abd5 block: Add BDRV_O_NO_SHARE for blk_new_open()
Normally, blk_new_open() just shares all permissions. This was fine
originally when permissions only protected against uses in the same
process because no other part of the code would actually get to access
the block nodes opened with blk_new_open(). However, since we use it for
file locking now, unsharing permissions becomes desirable.

Add a new BDRV_O_NO_SHARE flag that is used in blk_new_open() to unshare
any permissions that can be unshared.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210422164344.283389-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
72373e40fb block: bdrv_reopen_multiple: refresh permissions on updated graph
Move bdrv_reopen_multiple to new paradigm of permission update:
first update graph relations, then do refresh the permissions.

We have to modify reopen process in file-posix driver: with new scheme
we don't have prepared permissions in raw_reopen_prepare(), so we
should reconfigure fd in raw_check_perm(). Still this seems more native
and simple anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-31-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
1e4c797c75 block: make bdrv_refresh_limits() to be a transaction action
To be used in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-28-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
b75d64b329 block/backup-top: drop .active
We don't need this workaround anymore: bdrv_append is already smart
enough and we can use new bdrv_drop_filter().

This commit efficiently reverts also recent 705dde27c6, which
checked .active on io path. Still it said that the problem should be
theoretical. And the logic of filter removement is changed anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-25-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
228ca37e12 block: drop ctx argument from bdrv_root_attach_child
Passing parent aio context is redundant, as child_class and parent
opaque pointer are enough to retrieve it. Drop the argument and use new
bdrv_child_get_parent_aio_context() interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-7-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:47 +02:00
Vladimir Sementsov-Ogievskiy
3ca1f32257 block: BdrvChildClass: add .get_parent_aio_context handler
Add new handler to get aio context and implement it in all child
classes. Add corresponding public interface to be used soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-6-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:47 +02:00
Vladimir Sementsov-Ogievskiy
ae9d441706 block: bdrv_append(): don't consume reference
We have too much comments for this feature. It seems better just don't
do it. Most of real users (tests don't count) have to create additional
reference.

Drop also comment in external_snapshot_prepare:
 - bdrv_append doesn't "remove" old bs in common sense, it sounds
   strange
 - the fact that bdrv_append can fail is obvious from the context
 - the fact that we must rollback all changes in transaction abort is
   known (it's the direct role of abort)

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-5-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:47 +02:00
Vladimir Sementsov-Ogievskiy
0267101af6 block/nbd: fix possible use after free of s->connect_thread
If on nbd_close() we detach the thread (in
nbd_co_establish_connection_cancel() thr->state becomes
CONNECT_THREAD_RUNNING_DETACHED), after that point we should not use
s->connect_thread (which is set to NULL), as running thread may free it
at any time.

Still nbd_co_establish_connection() does exactly this: it saves
s->connect_thread to local variable (just for better code style) and
use it even after yield point, when thread may be already detached.

Fix that. Also check thr to be non-NULL on
nbd_co_establish_connection() start for safety.

After this patch "case CONNECT_THREAD_RUNNING_DETACHED" becomes
impossible in the second switch in nbd_co_establish_connection().
Still, don't add extra abort() just before the release. If it somehow
possible to reach this "case:" it won't hurt. Anyway, good refactoring
of all this reconnect mess will come soon.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210406155114.1057355-1-vsementsov@virtuozzo.com>
Reviewed-by: Roman Kagan <rvkagan@yandex-team.ru>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-04-13 15:35:12 +02:00
Max Reitz
00769414cd mirror: Do not enter a paused job on completion
Currently, it is impossible to complete jobs on standby (i.e. paused
ready jobs), but actually the only thing in mirror_complete() that does
not work quite well with a paused job is the job_enter() at the end.

If we make it conditional, this function works just fine even if the
mirror job is paused.

So technically this is a no-op, but obviously the intention is to accept
block-job-complete even for jobs on standby, which we need this patch
for first.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210409120422.144040-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-09 18:00:29 +02:00
Max Reitz
c41f5b96ee mirror: Move open_backing_file to exit_common
This is a graph change and therefore should be done in job-finalize
(which is what invokes mirror_exit_common()).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210409120422.144040-2-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-09 18:00:29 +02:00
Stefano Garzarella
b084b420d9 block/rbd: fix memory leak in qemu_rbd_co_create_opts()
When we allocate 'q_namespace', we forgot to set 'has_q_namespace'
to true. This can cause several issues, including a memory leak,
since qapi_free_BlockdevCreateOptions() does not deallocate that
memory, as reported by valgrind:

  13 bytes in 1 blocks are definitely lost in loss record 7 of 96
     at 0x4839809: malloc (vg_replace_malloc.c:307)
     by 0x48CEBB8: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.8)
     by 0x48E3FE3: g_strdup (in /usr/lib64/libglib-2.0.so.0.6600.8)
     by 0x180010: qemu_rbd_co_create_opts (rbd.c:446)
     by 0x1AE72C: bdrv_create_co_entry (block.c:492)
     by 0x241902: coroutine_trampoline (coroutine-ucontext.c:173)
     by 0x57530AF: ??? (in /usr/lib64/libc-2.32.so)
     by 0x1FFEFFFA6F: ???

Fix setting 'has_q_namespace' to true when we allocate 'q_namespace'.

Fixes: 19ae9ae014 ("block/rbd: Add support for ceph namespaces")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210329150129.121182-3-sgarzare@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-09 18:00:29 +02:00
Stefano Garzarella
c1c1f6cf51 block/rbd: fix memory leak in qemu_rbd_connect()
In qemu_rbd_connect(), 'mon_host' is allocated by qemu_rbd_mon_host()
using g_strjoinv(), but it's only freed in the error path, leaking
memory in the success path as reported by valgrind:

  80 bytes in 4 blocks are definitely lost in loss record 5,028 of 6,516
     at 0x4839809: malloc (vg_replace_malloc.c:307)
     by 0x5315BB8: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.8)
     by 0x532B6FF: g_strjoinv (in /usr/lib64/libglib-2.0.so.0.6600.8)
     by 0x87D07E: qemu_rbd_mon_host (rbd.c:538)
     by 0x87D07E: qemu_rbd_connect (rbd.c:562)
     by 0x87E1CE: qemu_rbd_open (rbd.c:740)
     by 0x840EB1: bdrv_open_driver (block.c:1528)
     by 0x8453A9: bdrv_open_common (block.c:1802)
     by 0x8453A9: bdrv_open_inherit (block.c:3444)
     by 0x8464C2: bdrv_open (block.c:3537)
     by 0x8108CD: qmp_blockdev_add (blockdev.c:3569)
     by 0x8EA61B: qmp_marshal_blockdev_add (qapi-commands-block-core.c:1086)
     by 0x90B528: do_qmp_dispatch_bh (qmp-dispatch.c:131)
     by 0x907EA4: aio_bh_poll (async.c:164)

Fix freeing 'mon_host' also when qemu_rbd_connect() ends correctly.

Fixes: 0a55679b4a
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210329150129.121182-2-sgarzare@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-09 18:00:29 +02:00
David Edmondson
07ee2ab4fd block/vdi: Don't assume that blocks are larger than VdiHeader
Given that the block size is read from the header of the VDI file, a
wide variety of sizes might be seen. Rather than re-using a block
sized memory region when writing the VDI header, allocate an
appropriately sized buffer.

Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Message-id: 20210325112941.365238-3-pbonzini@redhat.com
Message-Id: <20210309144015.557477-3-david.edmondson@oracle.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-31 10:44:21 +01:00
David Edmondson
574b8304cf block/vdi: When writing new bmap entry fails, don't leak the buffer
If a new bitmap entry is allocated, requiring the entire block to be
written, avoiding leaking the buffer allocated for the block should
the write fail.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Message-id: 20210325112941.365238-2-pbonzini@redhat.com
Message-Id: <20210309144015.557477-2-david.edmondson@oracle.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-03-31 10:44:21 +01:00
Max Reitz
484108293d qcow2: Force preallocation with data-file-raw
Setting the qcow2 data-file-raw bit means that you can ignore the
qcow2 metadata when reading from the external data file.  It does not
mean that you have to ignore it, though.  Therefore, the data read must
be the same regardless of whether you interpret the metadata or whether
you ignore it, and thus the L1/L2 tables must all be present and give a
1:1 mapping.

This patch changes 244's output: First, the qcow2 file is larger right
after creation, because of metadata preallocation.  Second, the qemu-img
map output changes: Everything that was not explicitly discarded or
zeroed is now a data area.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210326145509.163455-2-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2021-03-30 13:02:10 +02:00
Max Reitz
53431b9086 block/mirror: Fix mirror_top's permissions
mirror_top currently shares all permissions, and takes only the WRITE
permission (if some parent has taken that permission, too).

That is wrong, though; mirror_top is a filter, so it should take
permissions like any other filter does.  For example, if the parent
needs CONSISTENT_READ, we need to take that, too, and if it cannot share
the WRITE permission, we cannot share it either.

The exception is when mirror_top is used for active commit, where we
cannot take CONSISTENT_READ (because it is deliberately unshared above
the base node) and where we must share WRITE (so that it is shared for
all images in the backing chain, so the mirror job can take it for the
target BB).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210211172242.146671-2-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-03-29 18:09:00 +02:00
Pavel Dovgalyuk
ad0ce64279 qcow2: use external virtual timers
Regular virtual timers are used to emulate timings
related to vCPU and peripheral states. QCOW2 uses timers
to clean the cache. These timers should have external
flag. In the opposite case they affect the execution
and it can't be recorded and replayed.
This patch adds external flag to the timer for qcow2
cache clean.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <161700516327.1141158.8366564693714562536.stgit@pasha-ThinkPad-X280>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-03-29 18:04:29 +02:00
Markus Armbruster
bdabafc683 block: Remove monitor command block_passwd
Command block_passwd always fails since

Commit c01c214b69 "block: remove all encryption handling APIs"
(v2.10.0) turned block_passwd into a stub that always fails, and
hardcoded encryption_key_missing to false in query-named-block-nodes
and query-block.

Commit ad1324e044 "block: remove 'encryption_key_missing' flag from
QAPI" just landed.  Complete the cleanup job: remove block_passwd.

Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210323101951.3686029-1-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-23 22:31:56 +01:00
Stefan Hajnoczi
6f4b1996b4 block/export: disable VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD for now
The vhost-user in-flight shmfd feature has not been tested with
qemu-storage-daemon's vhost-user-blk server. Disable this optional
feature for now because it requires MFD_ALLOW_SEALING, which is not
available in some CI environments.

If we need this feature in the future it can be re-enabled after
testing.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210309094106.196911-2-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-19 10:15:06 +01:00
Max Reitz
0f418a2076 curl: Disconnect sockets from CURLState
When a curl transfer is finished, that does not mean that CURL lets go
of all the sockets it used for it.  We therefore must not free a
CURLSocket object before CURL has invoked curl_sock_cb() to tell us to
remove it.  Otherwise, we may get a use-after-free, as described in this
bug report: https://bugs.launchpad.net/qemu/+bug/1916501

(Reproducer from that report:
  $ qemu-img convert -f qcow2 -O raw \
  https://download.cirros-cloud.net/0.4.0/cirros-0.4.0-x86_64-disk.img \
  out.img
)

(Alternatively, it might seem logical to force-drop all sockets that
have been used for a state when the respective transfer is done, kind of
like it is done now, but including unsetting the AIO handlers.
Unfortunately, doing so makes the driver just hang instead of crashing,
which seems to evidence that CURL still uses those sockets.)

Make the CURLSocket object independent of "its" CURLState by putting all
sockets into a hash table belonging to the BDRVCURLState instead of a
list that belongs to a CURLState.  Do not touch any sockets in
curl_clean_state().

Testing, it seems like all sockets are indeed gone by the time the curl
BDS is closed, so it seems like there really was no point in freeing any
socket just because a transfer is done.  libcurl does invoke
curl_sock_cb() with CURL_POLL_REMOVE for every socket it has.

Buglink: https://bugs.launchpad.net/qemu/+bug/1916501
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210309130541.37540-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-19 10:15:06 +01:00
Max Reitz
3663dca461 curl: Store BDRVCURLState pointer in CURLSocket
A socket does not really belong to any specific state.  We do not need
to store a pointer to "its" state in it, a pointer to the common
BDRVCURLState is sufficient.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210309130541.37540-2-mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-19 10:15:06 +01:00
Kevin Wolf
1bf26076d6 stream: Don't crash when node permission is denied
The image streaming block job restricts shared permissions of the nodes
it accesses. This can obviously fail when other users already got these
permissions. &error_abort is therefore wrong and can crash. Handle these
errors gracefully and just fail starting the block job.

Reported-by: Nini Gu <ngu@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210309173451.45152-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-19 10:15:06 +01:00
Daniel P. Berrangé
8d17adf34f block: remove support for using "file" driver with block/char devices
The 'host_device' and 'host_cdrom' drivers must be used instead.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Daniel P. Berrangé
e67d8e2928 block: remove 'dirty-bitmaps' field from 'BlockInfo' struct
The same data is available in the 'BlockDeviceInfo' struct.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Daniel P. Berrangé
81cbfd5088 block: remove dirty bitmaps 'status' field
The same information is available via the 'recording' and 'busy' fields.

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Daniel P. Berrangé
ad1324e044 block: remove 'encryption_key_missing' flag from QAPI
This has been hardcoded to "false" since 2.10.0, since secrets required
to unlock block devices are now always provided up front instead of using
interactive prompts.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Peter Maydell
6f34661b6c Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmBJQHkSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748EdsP/2U2CGTM95tjDunTs9uZV/7zM6PWt85M
 vAPItNVU2jYPfzmaJN8twrzlj0PEDhvB9Q+OJjE4HEGxEbPcdblLg/R6Zs/EaWuY
 N6oKHPXnOnHb+e80UUJdiAq+Y5RUnJbb5L3ArycnVzBgws+Oj3DtqjB2VDccY4C/
 Gkt23tZ7ikU4958e5VBqW2NUUrr+BQO0mqsW+sbbeE3WPj75NQc6srvS3TWvsg7W
 OYEyVYwm52/q2W/1a3Knfv/YO6UU9NGMpGyDLD2kwQwKbgUWYLW2BiWVwOAUldo9
 De3nfKbKnFezLCZAZro20lfCa/aKwNGCOXWzlrKxqUQCmGYUx7gM1+3ahrSd5N0v
 zUgLdZm7O428ZHL6GujWGLA1UwwzpM9X3P3yo4c0S1J6fHypbI6a9jtewrUFvFgP
 TuQ7dp6cn2DTBYUcsrWilPHbTZMADYQNRD/xUtKqalYBEWy3FX5W75+OYBJKKh+X
 Qip68m6JBzgkszXhCcu6xlLb8ynZJr2VsHvtvIgf4NnLqNOIEgVLcMtoMZT8DPrp
 rIoRc5oUFz8zj5lHnJuLADBUvlCMqoCCoU3h2aqHwH8a7RGb180f+82BW9aBcb2u
 Jk+WgAhBUjWBBC97ReFgrINUD/qZRXVoOq8LthTuQSSyr/i1zq+oLM1F0EDXcMDm
 ssATku2IxL24
 =moUF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-for-6.0-pull-request' into staging

Pull request

# gpg: Signature made Wed 10 Mar 2021 21:56:09 GMT
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-for-6.0-pull-request: (22 commits)
  sysemu: Let VMChangeStateHandler take boolean 'running' argument
  sysemu/runstate: Let runstate_is_running() return bool
  hw/lm32/Kconfig: Have MILKYMIST select LM32_DEVICES
  hw/lm32/Kconfig: Rename CONFIG_LM32 -> CONFIG_LM32_DEVICES
  hw/lm32/Kconfig: Introduce CONFIG_LM32_EVR for lm32-evr/uclinux boards
  qemu-common.h: Update copyright string to 2021
  tests/fp/fp-test: Replace the word 'blacklist'
  qemu-options: Replace the word 'blacklist'
  seccomp: Replace the word 'blacklist'
  scripts/tracetool: Replace the word 'whitelist'
  ui: Replace the word 'whitelist'
  virtio-gpu: Adjust code space style
  exec/memory: Use struct Object typedef
  fuzz-test: remove unneccessary debugging flags
  net: Use id_generate() in the network subsystem, too
  MAINTAINERS: Fix the location of tools manuals
  vhost_user_gpu: Drop dead check for g_malloc() failure
  backends/dbus-vmstate: Fix short read error handling
  target/hexagon/gen_tcg_funcs: Fix a typo
  hw/elf_ops: Fix a typo
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-11 18:55:27 +00:00
Peter Maydell
9abda42bf2 nbd patches for 2021-03-09
- Add Vladimir as NBD co-maintainer
 - Fix reporting of holes in NBD_CMD_BLOCK_STATUS
 - Improve command-line parsing accuracy of large numbers (anything going
 through qemu_strtosz), including the deprecation of hex+suffix
 - Improve some error reporting in the block layer
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmBHlmIACgkQp6FrSiUn
 Q2q2cQgAqJWNb4J/ShjvzocDDPzJ0iBitFbg0huFPfbt4DScubEZo5wBJG7vOhOW
 hIHrWCRzGvRgsn0tcSfrgFaegmHKrLgjkibM7ou8ni9NC1kUBd3R/3FBNIMxhYf7
 Q8Kfspl0LRfMJDKF9jdCnQ4Gxcd6h2OIYZqiWVg8V4Tc8WdCpIVOah7e7wjuW8bT
 vgZvfboUWm5AmIF9j/MxuMn+HFZ4ArSuFVL80ZaXlD00vRra7u3HZ8pUfcOlOujg
 7HeouM1E5j3NNE6aZSN++x/EQ3sg0zmirbWUCcgAyRfdRkAmB15uh2PUzPxEIJKH
 UHUIW5LvNtz2+yzOAz2yK29OE523Yg==
 =blE1
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2021-03-09' into staging

nbd patches for 2021-03-09

- Add Vladimir as NBD co-maintainer
- Fix reporting of holes in NBD_CMD_BLOCK_STATUS
- Improve command-line parsing accuracy of large numbers (anything going
through qemu_strtosz), including the deprecation of hex+suffix
- Improve some error reporting in the block layer

# gpg: Signature made Tue 09 Mar 2021 15:38:10 GMT
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2021-03-09:
  block/qcow2: refactor qcow2_update_options_prepare error paths
  block/qed: bdrv_qed_do_open: deal with errp
  block/qcow2: simplify qcow2_co_invalidate_cache()
  block/qcow2: read_cache_sizes: return status value
  block/qcow2-bitmap: return status from qcow2_store_persistent_dirty_bitmaps
  block/qcow2-bitmap: improve qcow2_load_dirty_bitmaps() interface
  block/qcow2: qcow2_get_specific_info(): drop error propagation
  blockjob: return status from block_job_set_speed()
  block/mirror: drop extra error propagation in commit_active_start()
  block: drop extra error propagation for bdrv_set_backing_hd
  blockdev: fix drive_backup_prepare() missed error
  block: check return value of bdrv_open_child and drop error propagation
  utils: Deprecate hex-with-suffix sizes
  utils: Improve qemu_strtosz() to have 64 bits of precision
  utils: Enhance testsuite for do_strtosz()
  nbd: server: Report holes for raw images
  MAINTAINERS: add Vladimir as co-maintainer of NBD

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-11 13:57:08 +00:00
Philippe Mathieu-Daudé
538f049704 sysemu: Let VMChangeStateHandler take boolean 'running' argument
The 'running' argument from VMChangeStateHandler does not require
other value than 0 / 1. Make it a plain boolean.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210111152020.1422021-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-03-09 23:13:57 +01:00
Peter Maydell
a557b00469 Block layer patches:
- qemu-storage-daemon: add --pidfile option
 - qemu-storage-daemon: CLI error messages include the option name now
 - vhost-user-blk export: Misc fixes
 - docs: Improvements for qemu-storage-daemon documentation
 - parallels: load bitmap extension
 - backup-top: Don't crash on post-finalize accesses
 - Improve error messages related to node-name options
 - iotests improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEE3D3rFZqa+V09dFb+fwmycsiPL9YFAmBGWHURHGt3b2xmQHJl
 ZGhhdC5jb20ACgkQfwmycsiPL9ZpyxAAk0gRiayMUidSzgvzU/CeUhzBsC4ayEkn
 dLtTZ8hl7cW/w3GjDK1Wri4MANRN/0YHjiLSzO38lfVpK0z8SJr5aU4CwhRlOKVm
 VWgx+OLlV4Azht9fMNF4SwUXgXhl7pUNiFMNnomb++gvqhjMCedDZcWlnVKhbuQ+
 O3TKGO4tToSGaXP85jCM4xukw5HZ//4QMYg6MH0gDk8ahfE2MhyTHz64oDp412os
 qhxvc4bU2S5xGLaBfLGhsc6VPQFKjblG704P/Y73zeoxq12A0L2Ru98WvrNaXw7Z
 m54jJUINiDkJ7ZOl6W04zdeiLvs3BOrNe+7mxawOTmdkBsLOKErrhrTO1gJmHHmX
 kJLWEh9VYWxVbvE7C3KQt9bclR6wt+aPup4X1XE8pHtocPVONVq5bvctrVgxgK0b
 btN06NcK+2jQxcQkG4MnBJ8S41qmxHyIEQlQWKyUWXvKt6zsFU/NuWKMQrAfYZZi
 5J+RPU/fB073LY4lpAgou0OP1/RIvQmi5zWzjWm/Qbp3JpgC+azcYvxn7UU7J71P
 +u8IEQ4+Q9s0gvXQAh/U8AQg2eOqAwEAyFUJl9wpPN56O03dbI8KyCV5ECIRJu49
 CC8uKlJxZkbw9ZBs11SAmm/0J64WcNb2AMWxDPC8Z6oQbVaRRRznoRwRP2H6odUu
 uBolS43+5cI=
 =eAjH
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches:

- qemu-storage-daemon: add --pidfile option
- qemu-storage-daemon: CLI error messages include the option name now
- vhost-user-blk export: Misc fixes
- docs: Improvements for qemu-storage-daemon documentation
- parallels: load bitmap extension
- backup-top: Don't crash on post-finalize accesses
- Improve error messages related to node-name options
- iotests improvements

# gpg: Signature made Mon 08 Mar 2021 17:01:41 GMT
# gpg:                using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg:                issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (30 commits)
  blockdev: Clarify error messages pertaining to 'node-name'
  block: Clarify error messages pertaining to 'node-name'
  docs: qsd: Explain --export nbd,name=... default
  MAINTAINERS: update parallels block driver
  iotests: add parallels-read-bitmap test
  iotests.py: add unarchive_sample_image() helper
  parallels: support bitmap extension for read-only mode
  block/parallels: BDRVParallelsState: add cluster_size field
  parallels.txt: fix bitmap L1 table description
  qcow2-bitmap: make bytes_covered_by_bitmap_cluster() public
  block/export: port virtio-blk read/write range check
  block/export: port virtio-blk discard/write zeroes input validation
  block/export: fix vhost-user-blk export sector number calculation
  block/export: use VIRTIO_BLK_SECTOR_BITS
  block/export: fix blk_size double byteswap
  libqtest: add qtest_remove_abrt_handler()
  libqtest: add qtest_kill_qemu()
  libqtest: add qtest_socket_server()
  vhost-user-blk: fix blkcfg->num_queues endianness
  docs: replace insecure /tmp examples in qsd docs
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-09 21:31:18 +00:00
Vladimir Sementsov-Ogievskiy
1184b41101 block/qcow2: refactor qcow2_update_options_prepare error paths
Keep setting ret close to setting errp and don't merge different error
paths into one. This way it's more obvious that we don't return
error without setting errp.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-15-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 16:04:46 -06:00
Vladimir Sementsov-Ogievskiy
15ce94a68c block/qed: bdrv_qed_do_open: deal with errp
Always set errp on failure. The generic bdrv_open_driver supports
driver functions which can return a negative value but forget to set
errp.  That's a strange thing. Let's improve bdrv_qed_do_open to not
behave this way. This allows the simplification of code in
bdrv_qed_co_invalidate_cache().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210202124956.63146-14-vsementsov@virtuozzo.com>
[eblake: commit message grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 16:03:32 -06:00
Vladimir Sementsov-Ogievskiy
e6247c9c9f block/qcow2: simplify qcow2_co_invalidate_cache()
qcow2_do_open correctly sets errp on each failure path. So, we can
simplify code in qcow2_co_invalidate_cache() and drop explicit error
propagation.

Add ERRP_GUARD() as mandated by the documentation in
include/qapi/error.h so that error_prepend() is actually called even if
errp is &error_fatal.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210202124956.63146-13-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 16:03:27 -06:00
Vladimir Sementsov-Ogievskiy
772c4cad13 block/qcow2: read_cache_sizes: return status value
It's better to return status together with setting errp. It allows to
reduce error propagation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-12-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 16:03:23 -06:00
Vladimir Sementsov-Ogievskiy
526e31de99 block/qcow2-bitmap: return status from qcow2_store_persistent_dirty_bitmaps
It's better to return status together with setting errp. It makes
possible to avoid error propagation.

While being here, put ERRP_GUARD() to fix error_prepend(errp, ...)
usage inside qcow2_store_persistent_dirty_bitmaps() (see the comment
above ERRP_GUARD() definition in include/qapi/error.h)

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-11-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 16:03:21 -06:00
Vladimir Sementsov-Ogievskiy
0c1e9d2a9a block/qcow2-bitmap: improve qcow2_load_dirty_bitmaps() interface
It's recommended for bool functions with errp to return true on success
and false on failure. Non-standard interfaces don't help to understand
the code. The change is also needed to reduce error propagation.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210202124956.63146-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 16:01:47 -06:00
Vladimir Sementsov-Ogievskiy
83bad8cbf5 block/qcow2: qcow2_get_specific_info(): drop error propagation
Don't use error propagation in qcow2_get_specific_info(). For this
refactor qcow2_get_bitmap_info_list, its current interface is rather
weird.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210202124956.63146-9-vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
[eblake: separate local 'tail' variable from 'info_list' parameter]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 15:16:11 -06:00
Vladimir Sementsov-Ogievskiy
eb5becc18f block/mirror: drop extra error propagation in commit_active_start()
Let's check return value of mirror_start_job to check for failure
instead of local_err.

Rename ret to job, as ret is usually integer variable.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 15:14:14 -06:00
Vladimir Sementsov-Ogievskiy
bc52024959 block: check return value of bdrv_open_child and drop error propagation
This patch is generated by cocci script:

@@
symbol bdrv_open_child, errp, local_err;
expression file;
@@

  file = bdrv_open_child(...,
-                        &local_err
+                        errp
                        );
- if (local_err)
+ if (!file)
  {
      ...
-     error_propagate(errp, local_err);
      ...
  }

with command

spatch --sp-file x.cocci --macro-file scripts/cocci-macro-file.h \
--in-place --no-show-diff --max-width 80 --use-gitgrep block

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-4-vsementsov@virtuozzo.com>
[eblake: fix qcow2_do_open() to use ERRP_GUARD, necessary as the only
caller to pass allow_none=true]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-03-08 15:07:09 -06:00
Vladimir Sementsov-Ogievskiy
baefd97700 parallels: support bitmap extension for read-only mode
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210224104707.88430-5-vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:55 +01:00
Vladimir Sementsov-Ogievskiy
e0b5207f54 block/parallels: BDRVParallelsState: add cluster_size field
We are going to use it in more places, calculating
"s->tracks << BDRV_SECTOR_BITS" doesn't look good.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210224104707.88430-4-vsementsov@virtuozzo.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Vladimir Sementsov-Ogievskiy
35f428ba39 qcow2-bitmap: make bytes_covered_by_bitmap_cluster() public
Rename bytes_covered_by_bitmap_cluster() to
bdrv_dirty_bitmap_serialization_coverage() and make it public.
It is needed as we are going to share it with bitmap loading in
parallels format.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <20210224104707.88430-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Stefan Hajnoczi
05ae4e674e block/export: port virtio-blk read/write range check
Check that the sector number and byte count are valid.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-13-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Stefan Hajnoczi
db4eadf9f1 block/export: port virtio-blk discard/write zeroes input validation
Validate discard/write zeroes the same way we do for virtio-blk. Some of
these checks are mandated by the VIRTIO specification, others are
internal to QEMU.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-11-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Stefan Hajnoczi
e44362ce31 block/export: fix vhost-user-blk export sector number calculation
The driver is supposed to honor the blk_size field but the protocol
still uses 512-byte sector numbers. It is incorrect to multiply
req->sector_num by blk_size.

VIRTIO 1.1 5.2.5 Device Initialization says:

  blk_size can be read to determine the optimal sector size for the
  driver to use. This does not affect the units used in the protocol
  (always 512 bytes), but awareness of the correct value can affect
  performance.

Fixes: 3578389bcf ("block/export: vhost-user block device backend server")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-10-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Stefan Hajnoczi
524bac0744 block/export: use VIRTIO_BLK_SECTOR_BITS
Use VIRTIO_BLK_SECTOR_BITS and VIRTIO_BLK_SECTOR_SIZE when dealing with
virtio-blk sector numbers. Although the values happen to be the same as
BDRV_SECTOR_BITS and BDRV_SECTOR_SIZE, they are conceptually different.
This makes it clearer when we are dealing with virtio-blk sector units.

Use VIRTIO_BLK_SECTOR_BITS in vu_blk_initialize_config(). Later patches
will use it the new constants the virtqueue request processing code
path.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-9-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Stefan Hajnoczi
a4f1542af5 block/export: fix blk_size double byteswap
The config->blk_size field is little-endian. Use the native-endian
blk_size variable to avoid double byteswapping.

Fixes: 11f60f7eae ("block/export: make vhost-user-blk config space little-endian")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210223144653.811468-8-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:56:54 +01:00
Max Reitz
705dde27c6 backup-top: Refuse I/O in inactive state
When the backup-top node transitions from active to inactive in
bdrv_backup_top_drop(), the BlockCopyState is freed and the filtered
child is removed, so the node effectively becomes unusable.

However, noone told its I/O functions this, so they will happily
continue accessing bs->backing and s->bcs.  Prevent that by aborting
early when s->active is false.

(After the preceding patch, the node should be gone after
bdrv_backup_top_drop(), so this should largely be a theoretical problem.
But still, better to be safe than sorry, and also I think it just makes
sense to check s->active in the I/O functions.)

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210219153348.41861-3-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:55:18 +01:00
Max Reitz
bdc4c4c5e3 backup: Remove nodes from job in .clean()
The block job holds a reference to the backup-top node (because it is
passed as the main job BDS to block_job_create()).  Therefore,
bdrv_backup_top_drop() cannot delete the backup-top node (replacing it
by its child does not affect the job parent, because that has
.stay_at_node set).  That is a problem, because all of its I/O functions
assume the BlockCopyState (s->bcs) to be valid and that it has a
filtered child; but after bdrv_backup_top_drop(), neither of those
things are true.

It does not make sense to add new parents to backup-top after
backup_clean(), so we should detach it from the job before
bdrv_backup_top_drop().  Because there is no function to do that for a
single node, just detach all of the job's nodes -- the job does not do
anything past backup_clean() anyway.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210219153348.41861-2-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-03-08 14:55:18 +01:00
Paolo Bonzini
f7544edcd3 qemu-config: add error propagation to qemu_config_parse
This enables some simplification of vl.c via error_fatal, and improves
error messages.  Before:

  $ ./qemu-system-x86_64 -readconfig .
  qemu-system-x86_64: error reading file
  qemu-system-x86_64: -readconfig .: read config .: Invalid argument
  $ /usr/libexec/qemu-kvm -readconfig foo
  qemu-kvm: -readconfig foo: read config foo: No such file or directory

After:

  $ ./qemu-system-x86_64 -readconfig .
  qemu-system-x86_64: -readconfig .: Cannot read config file: Is a directory
  $ ./qemu-system-x86_64 -readconfig foo
  qemu-system-x86_64: -readconfig foo: Could not open 'foo': No such file or directory

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226170816.231173-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-06 11:41:54 +01:00
Maxim Levitsky
6094cbeb72 block: qcow2: remove the created file on initialization error
If the qcow initialization fails, we should remove the file if it was
already created, to avoid leaving stale files around.

We already do this for luks raw images.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20201217170904.946013-4-mlevitsk@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-02-15 15:10:14 +01:00
Maxim Levitsky
a890f08e58 block: add bdrv_co_delete_file_noerr
This function wraps bdrv_co_delete_file for the common case of removing a file,
which was just created by format driver, on an error condition.

It hides the -ENOTSUPP error, and reports all other errors otherwise.

Use it in luks driver

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20201217170904.946013-3-mlevitsk@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-02-15 15:10:14 +01:00
Maxim Levitsky
dcb6699512 crypto: luks: Fix tiny memory leak
When the underlying block device doesn't support the
bdrv_co_delete_file interface, an 'Error' object was leaked.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201217170904.946013-2-mlevitsk@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-02-15 15:10:14 +01:00
Peter Maydell
392b9a74b9 bitmaps patches for 2021-02-12
- add 'transform' member to manipulate bitmaps across migration
 - work towards better error handling during bdrv_open
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEccLMIrHEYCkn0vOqp6FrSiUnQ2oFAmAnDQsACgkQp6FrSiUn
 Q2qc5Qf/SKVdpX4j7OnHF6sBuf/8LVWz4KazSqEU0ohazBJmafgJpH2EA5pXMXR4
 frZDWeanGmhj1MjMkta/++uvEBU/TMpW2z98mZvjErteXdnRQAlII/hOCI+QZJvg
 viQ5t1EyrkyXzUePOjs+AwqA5KHWbCKt6QqyItQ78HvI23sw/fuvHj0G67KbVzXZ
 VcSrVr0J7PXnZV/hWfg+C+Nn9Ro9tsVdn79awLYVQ7/SDro3hzylpcHMQaHMK2oe
 mX4D2kNq7s21E27Zb6vlknUhQPkMdETk0gfEbpn7sTVMEc58GRLC7Tqfx7l0JIFK
 5izVyA5vndKVxDGYPkbDK6VL2uDg4A==
 =+Epy
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-bitmaps-2021-02-12' into staging

bitmaps patches for 2021-02-12

- add 'transform' member to manipulate bitmaps across migration
- work towards better error handling during bdrv_open

# gpg: Signature made Fri 12 Feb 2021 23:19:39 GMT
# gpg:                using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-bitmaps-2021-02-12:
  block: use return status of bdrv_append()
  block: return status from bdrv_append and friends
  qemu-iotests: 300: Add test case for modifying persistence of bitmap
  migration: dirty-bitmap: Allow control of bitmap persistence
  migration: dirty-bitmap: Use struct for alias map inner members

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-02-13 21:26:00 +00:00
Vladimir Sementsov-Ogievskiy
934aee14d3 block: use return status of bdrv_append()
Now bdrv_append returns status and we can drop all the local_err things
around it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210202124956.63146-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 15:39:44 -06:00
Vladimir Sementsov-Ogievskiy
ff789bf5a9 block/backup: implement .cancel job handler
Cancel in-flight io on target to not waste the time.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210205163720.887197-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 12:17:08 -06:00
Vladimir Sementsov-Ogievskiy
521ff8b779 block/mirror: implement .cancel job handler
Cancel in-flight io on target to not waste the time.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210205163720.887197-6-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 11:29:40 -06:00
Vladimir Sementsov-Ogievskiy
3fc1ec3725 block/raw-format: implement .bdrv_cancel_in_flight handler
We are going to cancel in-flight requests on mirror nbd target on job
cancel. Still nbd is often used not directly but as raw-format child.
So, add pass-through handler here.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210205163720.887197-4-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 09:45:18 -06:00
Vladimir Sementsov-Ogievskiy
c4f7f24e1f block/nbd: implement .bdrv_cancel_in_flight
Just stop waiting for connection in existing requests.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210205163720.887197-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 09:45:18 -06:00
Vladimir Sementsov-Ogievskiy
bd54669a4a block: add new BlockDriver handler: bdrv_cancel_in_flight
It will be used to stop retrying NBD requests on mirror cancel.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210205163720.887197-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-12 09:45:18 -06:00
Daniel P. Berrangé
3d3e9b1f66 block: rename and alter bdrv_all_find_snapshot semantics
Currently bdrv_all_find_snapshot() will return 0 if it finds
a snapshot, -1 if an error occurs, or if it fails to find a
snapshot. New callers to be added want to distinguish between
the error scenario and failing to find a snapshot.

Rename it to bdrv_all_has_snapshot and make it return -1 on
error, 0 if no snapshot is found and 1 if snapshot is found.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-7-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
c22d644ca7 block: allow specifying name of block device for vmstate storage
Currently the vmstate will be stored in the first block device that
supports snapshots. Historically this would have usually been the
root device, but with UEFI it might be the variable store. There
needs to be a way to override the choice of block device to store
the state in.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-6-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
cf3a74c94f block: add ability to specify list of blockdevs during snapshot
When running snapshot operations, there are various rules for which
blockdevs are included/excluded. While this provides reasonable default
behaviour, there are scenarios that are not well handled by the default
logic. Some of the conditions do not have a single correct answer.

Thus there needs to be a way for the mgmt app to provide an explicit
list of blockdevs to perform snapshots across. This can be achieved
by passing a list of node names that should be used.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210204124834.774401-5-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Daniel P. Berrangé
e26f98e209 block: push error reporting into bdrv_all_*_snapshot functions
The bdrv_all_*_snapshot functions return a BlockDriverState pointer
for the invalid backend, which the callers then use to report an
error message. In some cases multiple callers are reporting the
same error message, but with slightly different text. In the future
there will be more error scenarios for some of these methods, which
will benefit from fine grained error message reporting. So it is
helpful to push error reporting down a level.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[PMD: Initialize variables]
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210204124834.774401-2-berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-02-08 11:19:51 +00:00
Roman Kagan
ddde5ee769 block/nbd: only enter connection coroutine if it's present
When an NBD block driver state is moved from one aio_context to another
(e.g. when doing a drain in a migration thread),
nbd_client_attach_aio_context_bh is executed that enters the connection
coroutine.

However, the assumption that ->connection_co is always present here
appears incorrect: the connection may have encountered an error other
than -EIO in the underlying transport, and thus may have decided to quit
rather than keep trying to reconnect, and therefore it may have
terminated the connection coroutine.  As a result an attempt to reassign
the client in this state (NBD_CLIENT_QUIT) to a different aio_context
leads to a null pointer dereference:

  #0  qio_channel_detach_aio_context (ioc=0x0)
      at /build/qemu-gYtjVn/qemu-5.0.1/io/channel.c:452
  #1  0x0000562a242824b3 in bdrv_detach_aio_context (bs=0x562a268d6a00)
      at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6151
  #2  bdrv_set_aio_context_ignore (bs=bs@entry=0x562a268d6a00,
      new_context=new_context@entry=0x562a260c9580,
      ignore=ignore@entry=0x7feeadc9b780)
      at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6230
  #3  0x0000562a24282969 in bdrv_child_try_set_aio_context
      (bs=bs@entry=0x562a268d6a00, ctx=0x562a260c9580,
      ignore_child=<optimized out>, errp=<optimized out>)
      at /build/qemu-gYtjVn/qemu-5.0.1/block.c:6332
  #4  0x0000562a242bb7db in blk_do_set_aio_context (blk=0x562a2735d0d0,
      new_context=0x562a260c9580,
      update_root_node=update_root_node@entry=true, errp=errp@entry=0x0)
      at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:1989
  #5  0x0000562a242be0bd in blk_set_aio_context (blk=<optimized out>,
      new_context=<optimized out>, errp=errp@entry=0x0)
      at /build/qemu-gYtjVn/qemu-5.0.1/block/block-backend.c:2010
  #6  0x0000562a23fbd953 in virtio_blk_data_plane_stop (vdev=<optimized
      out>)
      at /build/qemu-gYtjVn/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292
  #7  0x0000562a241fc7bf in virtio_bus_stop_ioeventfd (bus=0x562a260dbf08)
      at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio-bus.c:245
  #8  0x0000562a23fefb2e in virtio_vmstate_change (opaque=0x562a260dbf90,
      running=0, state=<optimized out>)
      at /build/qemu-gYtjVn/qemu-5.0.1/hw/virtio/virtio.c:3220
  #9  0x0000562a2402ebfd in vm_state_notify (running=running@entry=0,
      state=state@entry=RUN_STATE_FINISH_MIGRATE)
      at /build/qemu-gYtjVn/qemu-5.0.1/softmmu/vl.c:1275
  #10 0x0000562a23f7bc02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE,
      send_stop=<optimized out>)
      at /build/qemu-gYtjVn/qemu-5.0.1/cpus.c:1032
  #11 0x0000562a24209765 in migration_completion (s=0x562a260e83a0)
      at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:2914
  #12 migration_iteration_run (s=0x562a260e83a0)
      at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3275
  #13 migration_thread (opaque=opaque@entry=0x562a260e83a0)
      at /build/qemu-gYtjVn/qemu-5.0.1/migration/migration.c:3439
  #14 0x0000562a2435ca96 in qemu_thread_start (args=<optimized out>)
      at /build/qemu-gYtjVn/qemu-5.0.1/util/qemu-thread-posix.c:519
  #15 0x00007feed31466ba in start_thread (arg=0x7feeadc9c700)
      at pthread_create.c:333
  #16 0x00007feed2e7c41d in __GI___sysctl (name=0x0, nlen=608471908,
      oldval=0x562a2452b138, oldlenp=0x0, newval=0x562a2452c5e0
      <__func__.28102>, newlen=0)
      at ../sysdeps/unix/sysv/linux/sysctl.c:30
  #17 0x0000000000000000 in ?? ()

Fix it by checking that the connection coroutine is non-null before
trying to enter it.  If it is null, no entering is needed, as the
connection is probably going down anyway.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210129073859.683063-3-rvkagan@yandex-team.ru>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:12 -06:00
Roman Kagan
3b5e4db673 block/nbd: only detach existing iochannel from aio_context
When the reconnect in NBD client is in progress, the iochannel used for
NBD connection doesn't exist.  Therefore an attempt to detach it from
the aio_context of the parent BlockDriverState results in a NULL pointer
dereference.

The problem is triggerable, in particular, when an outgoing migration is
about to finish, and stopping the dataplane tries to move the
BlockDriverState from the iothread aio_context to the main loop.  If the
NBD connection is lost before this point, and the NBD client has entered
the reconnect procedure, QEMU crashes:

  #0  qemu_aio_coroutine_enter (ctx=0x5618056c7580, co=0x0)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-coroutine.c:109
  #1  0x00005618034b1b68 in nbd_client_attach_aio_context_bh (
      opaque=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block/nbd.c:164
  #2  0x000056180353116b in aio_wait_bh (opaque=0x7f60e1e63700)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:55
  #3  0x0000561803530633 in aio_bh_call (bh=0x7f60d40a7e80)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:136
  #4  aio_bh_poll (ctx=ctx@entry=0x5618056c7580)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/async.c:164
  #5  0x0000561803533e5a in aio_poll (ctx=ctx@entry=0x5618056c7580,
      blocking=blocking@entry=true)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-posix.c:650
  #6  0x000056180353128d in aio_wait_bh_oneshot (ctx=0x5618056c7580,
      cb=<optimized out>, opaque=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/aio-wait.c:71
  #7  0x000056180345c50a in bdrv_attach_aio_context (new_context=0x5618056c7580,
      bs=0x561805ed4c00) at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6172
  #8  bdrv_set_aio_context_ignore (bs=bs@entry=0x561805ed4c00,
      new_context=new_context@entry=0x5618056c7580,
      ignore=ignore@entry=0x7f60e1e63780)
      at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6237
  #9  0x000056180345c969 in bdrv_child_try_set_aio_context (
      bs=bs@entry=0x561805ed4c00, ctx=0x5618056c7580,
      ignore_child=<optimized out>, errp=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/block.c:6332
  #10 0x00005618034957db in blk_do_set_aio_context (blk=0x56180695b3f0,
      new_context=0x5618056c7580, update_root_node=update_root_node@entry=true,
      errp=errp@entry=0x0)
      at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:1989
  #11 0x00005618034980bd in blk_set_aio_context (blk=<optimized out>,
      new_context=<optimized out>, errp=errp@entry=0x0)
      at /build/qemu-6MF7tq/qemu-5.0.1/block/block-backend.c:2010
  #12 0x0000561803197953 in virtio_blk_data_plane_stop (vdev=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/hw/block/dataplane/virtio-blk.c:292
  #13 0x00005618033d67bf in virtio_bus_stop_ioeventfd (bus=0x5618056d9f08)
      at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio-bus.c:245
  #14 0x00005618031c9b2e in virtio_vmstate_change (opaque=0x5618056d9f90,
      running=0, state=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/hw/virtio/virtio.c:3220
  #15 0x0000561803208bfd in vm_state_notify (running=running@entry=0,
      state=state@entry=RUN_STATE_FINISH_MIGRATE)
      at /build/qemu-6MF7tq/qemu-5.0.1/softmmu/vl.c:1275
  #16 0x0000561803155c02 in do_vm_stop (state=RUN_STATE_FINISH_MIGRATE,
      send_stop=<optimized out>) at /build/qemu-6MF7tq/qemu-5.0.1/cpus.c:1032
  #17 0x00005618033e3765 in migration_completion (s=0x5618056e6960)
      at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:2914
  #18 migration_iteration_run (s=0x5618056e6960)
      at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3275
  #19 migration_thread (opaque=opaque@entry=0x5618056e6960)
      at /build/qemu-6MF7tq/qemu-5.0.1/migration/migration.c:3439
  #20 0x0000561803536ad6 in qemu_thread_start (args=<optimized out>)
      at /build/qemu-6MF7tq/qemu-5.0.1/util/qemu-thread-posix.c:519
  #21 0x00007f61085d06ba in start_thread ()
     from /lib/x86_64-linux-gnu/libpthread.so.0
  #22 0x00007f610830641d in sysctl () from /lib/x86_64-linux-gnu/libc.so.6
  #23 0x0000000000000000 in ?? ()

Fix it by checking that the iochannel is non-null before trying to
detach it from the aio_context.  If it is null, no detaching is needed,
and it will get reattached in the proper aio_context once the connection
is reestablished.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210129073859.683063-2-rvkagan@yandex-team.ru>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:12 -06:00
Vladimir Sementsov-Ogievskiy
a5215b8fdf block/io: use int64_t bytes in copy_range
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, convert now copy_range parameters which are already 64bit to signed
type.

It's safe as we don't work with requests overflowing BDRV_MAX_LENGTH
(which is less than INT64_MAX), and do check the requests in
bdrv_co_copy_range_internal() (by bdrv_check_request32(), which calls
bdrv_check_request()).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-17-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:12 -06:00
Vladimir Sementsov-Ogievskiy
e9e52efdc5 block/io: support int64_t bytes in read/write wrappers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

Now, since bdrv_co_preadv_part() and bdrv_co_pwritev_part() have been
updated, update all their wrappers.

For all of them type of 'bytes' is widening, so callers are safe. We
have update request_fn in blkverify.c simultaneously. Still it's just a
pointer to one of bdrv_co_pwritev() or bdrv_co_preadv(), and type is
widening for callers of the request_fn anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-16-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:12 -06:00
Vladimir Sementsov-Ogievskiy
37e9403ea8 block/io: support int64_t bytes in bdrv_co_p{read,write}v_part()
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, prepare bdrv_co_preadv_part() and bdrv_co_pwritev_part() and their
remaining dependencies now.

bdrv_pad_request() is updated simultaneously, as pointer to bytes passed
to it both from bdrv_co_pwritev_part() and bdrv_co_preadv_part().

So, all callers of bdrv_pad_request() are updated to pass 64bit bytes.
bdrv_pad_request() is already good for 64bit requests, add
corresponding assertion.

Look at bdrv_co_preadv_part() and bdrv_co_pwritev_part().
Type is widening, so callers are safe. Let's look inside the functions.

In bdrv_co_preadv_part() and bdrv_aligned_pwritev() we only pass bytes
to other already int64_t interfaces (and some obviously safe
calculations), it's OK.

In bdrv_co_do_zero_pwritev() aligned_bytes may become large now, still
it's passed to bdrv_aligned_pwritev which supports int64_t bytes.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-15-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:11 -06:00
Vladimir Sementsov-Ogievskiy
8b0c5d7659 block/io: support int64_t bytes in bdrv_aligned_preadv()
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, prepare bdrv_aligned_preadv() now.

Make the bytes variable in bdrv_padding_rmw_read() int64_t, as it is
only used for pass-through to bdrv_aligned_preadv().

All bdrv_aligned_preadv() callers are safe as type is widening. Let's
look inside:

 - add a new-style assertion that request is good.
 - callees bdrv_is_allocated(), bdrv_co_do_copy_on_readv() supports
   int64_t bytes
 - conversion of bytes_remaining is OK, as we never have requests
   overflowing BDRV_MAX_LENGTH
 - looping through bytes_remaining is ok, num is updated to int64_t
   - for bdrv_driver_preadv we have same limit of max_transfer
   - qemu_iovec_memset is OK, as bytes+qiov_offset should not overflow
     qiov->size anyway (thanks to bdrv_check_qiov_request())

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-14-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:11 -06:00
Vladimir Sementsov-Ogievskiy
9df5afbdd1 block/io: support int64_t bytes in bdrv_co_do_copy_on_readv()
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, prepare bdrv_co_do_copy_on_readv() now.

'bytes' type widening, so callers are safe. Look at the function
itself:

bytes, skip_bytes and progress become int64_t.

bdrv_round_to_clusters() is OK, cluster_bytes now may be large.
trace_bdrv_co_do_copy_on_readv() is OK

looping through cluster_bytes is still OK.

pnum is still capped to max_transfer, and to MAX_BOUNCE_BUFFER when we
are going to do COR operation. Therefor calculations in
qemu_iovec_from_buf() and bdrv_driver_preadv() should not change.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-13-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:17:11 -06:00
Vladimir Sementsov-Ogievskiy
fcfd9ade68 block/io: support int64_t bytes in bdrv_aligned_pwritev()
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, prepare bdrv_aligned_pwritev() now and convert the dependencies:
bdrv_co_write_req_prepare() and bdrv_co_write_req_finish() to signed
type bytes.

Conversion of bdrv_co_write_req_prepare() and
bdrv_co_write_req_finish() is definitely safe, as all requests in
block/io must not overflow BDRV_MAX_LENGTH. Still add assertions.

For bdrv_aligned_pwritev() 'bytes' type is widened, so callers are
safe. Let's check usage of the parameter inside the function.

Passing to bdrv_co_write_req_prepare() and bdrv_co_write_req_finish()
is OK.

Passing to qemu_iovec_* is OK after new assertion. All other callees
are already updated to int64_t.

Checking alignment is not changed, offset + bytes and qiov_offset +
bytes calculations are safe (thanks to new assertions).

max_transfer is kept to be int for now. It has a default of INT_MAX
here, and some drivers may rely on it. It's to be refactored later.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-12-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:16:03 -06:00
Vladimir Sementsov-Ogievskiy
5ae07b1410 block/io: support int64_t bytes in bdrv_co_do_pwrite_zeroes()
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, prepare bdrv_co_do_pwrite_zeroes() now.

Callers are safe, as converting int to int64_t is safe. Concentrate on
'bytes' usage in the function (thx to Eric Blake):

    compute 'int tail' via % 'int alignment' - safe
    fragmentation loop 'int num' - still fragments with a cap on
      max_transfer

    use of 'num' within the loop
    MIN(bytes, max_transfer) as well as %alignment - still works, so
         calculations in if (head) {} are safe
    clamp size by 'int max_write_zeroes' - safe
    drv->bdrv_co_pwrite_zeroes(int) - safe because of clamping
    clamp size by 'int max_transfer' - safe
    buf allocation is still clamped to max_transfer
    qemu_iovec_init_buf(size_t) - safe because of clamping
    bdrv_driver_pwritev(uint64_t) - safe

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-11-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:16:03 -06:00
Vladimir Sementsov-Ogievskiy
17abcbeee2 block/io: use int64_t bytes in driver wrappers
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

So, convert driver wrappers parameters which are already 64bit to
signed type.

Requests in block/io.c must never exceed BDRV_MAX_LENGTH (which is less
than INT64_MAX), which makes the conversion to signed 64bit type safe.

Add corresponding assertions.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-10-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:16:03 -06:00
Eric Blake
8024726459 block: use int64_t as bytes type in tracked requests
We are generally moving to int64_t for both offset and bytes parameters
on all io paths.

Main motivation is realization of 64-bit write_zeroes operation for
fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

All requests in block/io must not overflow BDRV_MAX_LENGTH, all
external users of BdrvTrackedRequest already have corresponding
assertions, so we are safe. Add some assertions still.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-9-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:14:15 -06:00
Vladimir Sementsov-Ogievskiy
63f4ad1186 block/io: improve bdrv_check_request: check qiov too
Operations with qiov add more restrictions on bytes, let's cover it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-8-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:14:00 -06:00
Vladimir Sementsov-Ogievskiy
801625e69d block/throttle-groups: throttle_group_co_io_limits_intercept(): 64bit bytes
The function is called from 64bit io handlers, and bytes is just passed
to throttle_account() which is 64bit too (unsigned though). So, let's
convert intermediate argument to 64bit too.

This patch is a first in the 64-bit-blocklayer series, so we are
generally moving to int64_t for both offset and bytes parameters on all
io paths. Main motivation is realization of 64-bit write_zeroes
operation for fast zeroing large disk chunks, up to the whole disk.

We chose signed type, to be consistent with off_t (which is signed) and
with possibility for signed return type (where negative value means
error).

Patch-correctness audit by Eric Blake:

  Caller has 32-bit, this patch now causes widening which is safe:
  block/block-backend.c: blk_do_preadv() passes 'unsigned int'
  block/block-backend.c: blk_do_pwritev_part() passes 'unsigned int'
  block/throttle.c: throttle_co_pwrite_zeroes() passes 'int'
  block/throttle.c: throttle_co_pdiscard() passes 'int'

  Caller has 64-bit, this patch fixes potential bug where pre-patch
  could narrow, except it's easy enough to trace that callers are still
  capped at 2G actions:
  block/throttle.c: throttle_co_preadv() passes 'uint64_t'
  block/throttle.c: throttle_co_pwritev() passes 'uint64_t'

  Implementation in question: block/throttle-groups.c
  throttle_group_co_io_limits_intercept() takes 'unsigned int bytes'
  and uses it: argument to util/throttle.c throttle_account(uint64_t)

  All safe: it patches a latent bug, and does not introduce any 64-bit
  gotchas once throttle_co_p{read,write}v are relaxed, and assuming
  throttle_account() is not buggy.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20201211183934.169161-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:14:00 -06:00
Vladimir Sementsov-Ogievskiy
98ca45494f block/io: bdrv_pad_request(): support qemu_iovec_init_extended failure
Make bdrv_pad_request() honest: return error if
qemu_iovec_init_extended() failed.

Update also bdrv_padding_destroy() to clean the structure for safety.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:14:00 -06:00
Vladimir Sementsov-Ogievskiy
f0deecff82 block/io: refactor bdrv_pad_request(): move bdrv_pad_request() up
Prepare for the following patch when bdrv_pad_request() will be able to
fail. Update the comments.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-5-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: grammar tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:00:52 -06:00
Vladimir Sementsov-Ogievskiy
a56ed80c42 block: fix theoretical overflow in bdrv_init_padding()
Calculation of sum may theoretically overflow, so use 64bit type and
add some good assertions.

Use int64_t constantly.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-4-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: tweak assertion order]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:00:33 -06:00
Vladimir Sementsov-Ogievskiy
4c002cef0e util/iov: make qemu_iovec_init_extended() honest
Actually, we can't extend the io vector in all cases. Handle possible
MAX_IOV and size_t overflows.

For now add assertion to callers (actually they rely on success anyway)
and fix them in the following patch.

Add also some additional good assertions to qemu_iovec_init_slice()
while being here.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:00:33 -06:00
Vladimir Sementsov-Ogievskiy
69b55e03f7 block: refactor bdrv_check_request: add errp
It's better to pass &error_abort than just assert that result is 0: on
crash, we'll immediately see the reason in the backtrace.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-2-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix iotest 206 fallout]
Signed-off-by: Eric Blake <eblake@redhat.com>
2021-02-03 08:00:33 -06:00
Kevin Wolf
26513a0174 block: Fix VM size column width in bdrv_snapshot_dump()
size_to_str() can return a size like "4.24 MiB", with a single digit
integer part and two fractional digits. This is eight characters, but
commit b39847a5 changed the format string to only reserve seven
characters for the column.

This can result in unaligned columns, which in turn changes the output of
iotests case 267 because exceeding the column size defeats the attempt
to filter the size out of the output (observed with the ppc64 emulator).
The resulting change is only a whitespace change, but since commit
f203080b this is enough for iotests to consider the test failed.

Taking a character away from the tag name column and adding it to the VM
size column doesn't change anything in the common case (the tag name is
left justified, the VM size is right justified), but fixes this case.

Fixes: b39847a505
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210202155911.179865-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-02-02 17:23:55 +01:00