Commit Graph

1580 Commits

Author SHA1 Message Date
Emanuele Giuseppe Esposito
a41cfda126 block: rename bdrv_child_try_change_aio_context in bdrv_try_change_aio_context
No functional changes intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-10-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
d2aafbb68a block: remove all unused ->can_set_aio_ctx and ->set_aio_ctx callbacks
Together with all _can_set_ and _set_ APIs, as they are not needed
anymore.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-9-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
f8be48adf0 block: use the new _change_ API instead of _can_set_ and _set_
Replace all direct usage of ->can_set_aio_ctx and ->set_aio_ctx,
and call bdrv_child_try_change_aio_context() in
bdrv_try_set_aio_context(), the main function called through
the whole block layer.

From this point onwards, ->can_set_aio_ctx and ->set_aio_ctx
won't be used anymore.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-8-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
27633e740e block: implement .change_aio_ctx in child_of_bds
bdrv_child_cb_change_aio_ctx() is identical to
bdrv_child_cb_can_set_aio_ctx(), as we only need
to recursively go on the parent bs.

Note: bdrv_child_try_change_aio_context() is not called by
anyone at this point.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-6-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
e08cc001d5 bdrv_change_aio_context: use hash table instead of list of visited nodes
Minor performance improvement, but given that we have hash tables
available, avoid iterating in the visited nodes list every time just
to check if a node has been already visited.

The data structure is not actually a proper hash map, but an hash set,
as we are just adding nodes and not key,value pairs.

Suggested-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20221025084952.2139888-4-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
7e8c182fb5 block: use transactions as a replacement of ->{can_}set_aio_context()
Simplify the way the aiocontext can be changed in a BDS graph.
There are currently two problems in bdrv_try_set_aio_context:
- There is a confusion of AioContext locks taken and released, because
  we assume that old aiocontext is always taken and new one is
  taken inside.

- It doesn't look very safe to call bdrv_drained_begin while some
  nodes have already switched to the new aiocontext and others haven't.
  This could be especially dangerous because bdrv_drained_begin polls, so
  something else could be executed while graph is in an inconsistent
  state.

Additional minor nitpick: can_set and set_ callbacks both traverse the
graph, both using the ignored list of visited nodes in a different way.

Therefore, get rid of all of this and introduce a new callback,
change_aio_context, that uses transactions to efficiently, cleanly
and most importantly safely change the aiocontext of a graph.

This new callback is a "merge" of the two previous ones:
- Just like can_set_aio_context, recursively traverses the graph.
  Marks all nodes that are visited using a GList, and checks if
  they *could* change the aio_context.
- For each node that passes the above check, drain it and add a new transaction
  that implements a callback that effectively changes the aiocontext.
- Once done, the recursive function returns if *all* nodes can change
  the AioContext. If so, commit the above transactions.
  Regardless of the outcome, call transaction.clean() to undo all drains
  done in the recursion.
- The transaction list is scanned only after all nodes are being drained, so
  we are sure that they all are in the same context, and then
  we switch their AioContext, concluding the drain only after all nodes
  switched to the new AioContext. In this way we make sure that
  bdrv_drained_begin() is always called under the old AioContext, and
  bdrv_drained_end() under the new one.
- Because of the above, we don't need to release and re-acquire the
  old AioContext every time, as everything is done once (and not
  per-node drain and aiocontext change).

Note that the "change" API is not yet invoked anywhere.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221025084952.2139888-3-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Emanuele Giuseppe Esposito
7f898610f6 block.c: assert bs->aio_context is written under BQL and drains
Also here ->aio_context is read by I/O threads and written
under BQL.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20221025084952.2139888-2-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
57f08941d3 block: refactor bdrv_remove_file_or_backing_child to bdrv_remove_child
Now the function can remove any child, so give it more common name.
Drop assertions and drop bs argument which becomes unused. Function
would be reused in a further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-16-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
5bb0474778 block: Manipulate bs->file / bs->backing pointers in .attach/.detach
bs->file and bs->backing are a kind of duplication of part of
bs->children. But very useful diplication, so let's not drop them at
all:)

We should manage bs->file and bs->backing in same place, where we
manage bs->children, to keep them in sync.

Moreover, generic io paths are unprepared to BdrvChild without a bs, so
it's double good to clear bs->file / bs->backing when we detach the
child.

Detach is simple: if we detach bs->file or bs->backing child, just
set corresponding field to NULL.

Attach is a bit more complicated. But we still can precisely detect
should we set one of bs->file / bs->backing or not:

- if role is BDRV_CHILD_COW, we definitely deal with bs->backing
- else, if role is BDRV_CHILD_FILTERED (it must be also
  BDRV_CHILD_PRIMARY), it's a filtered child. Use
  bs->drv->filtered_child_is_backing to chose the pointer field to
  modify.
- else, if role is BDRV_CHILD_PRIMARY, we deal with bs->file
- in all other cases, it's neither bs->backing nor bs->file. It's some
  other child and we shouldn't care

OK. This change brings one more good thing: we can (and should) get rid
of all indirect pointers in the block-graph-change transactions:

bdrv_attach_child_common() stores BdrvChild** into transaction to clear
it on abort.

bdrv_attach_child_common() has two callers: bdrv_attach_child_noperm()
just pass-through this feature, bdrv_root_attach_child() doesn't need
the feature.

Look at bdrv_attach_child_noperm() callers:
  - bdrv_attach_child() doesn't need the feature
  - bdrv_set_file_or_backing_noperm() uses the feature to manage
    bs->file and bs->backing, we don't want it anymore
  - bdrv_append() uses the feature to manage bs->backing, again we
    don't want it anymore

So, we should drop this stuff! Great!

We could probably keep BdrvChild** argument to keep the int return
value, but it seems not worth the complexity.

Finally, we now set .file / .backing automatically in generic code and
want to restring setting them by hand outside of .attach/.detach.
So, this patch cleanups all remaining places where they were set.
To find such places I use:

  git grep '\->file ='
  git grep '\->backing ='
  git grep '&.*\<backing\>'
  git grep '&.*\<file\>'

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-14-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
544acc7d1e Revert "block: Pass BdrvChild ** to replace_child_noperm"
That's a preparation to previously reverted
"block: Let replace_child_noperm free children". Drop it too, we don't
need it for a new approach.

This reverts commit be64bbb014.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-13-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
a2c37a3042 Revert "block: Restructure remove_file_or_backing_child()"
That's a preparation to previously reverted
"block: Let replace_child_noperm free children". Drop it too, we don't
need it for a new approach.

This reverts commit 562bda8bb4.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-12-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
0f0b1e29d3 Revert "block: Let replace_child_tran keep indirect pointer"
That's a preparation to previously reverted
"block: Let replace_child_noperm free children". Drop it too, we don't
need it for a new approach.

This reverts commit 82b54cf516.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-11-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
4eba825a82 Revert "block: Let replace_child_noperm free children"
We are going to reimplement this behavior (clear bs->file / bs->backing
pointers automatically when child->bs is cleared) in a nicer way, see
further commit
"block: Manipulate bs->file / bs->backing pointers in .attach/.detach".

With this revert we bring back a problem that was fixed by b0a9f6fed3.
Still the problem was mostly theoretical, we don't have concrete bugs
fixed by b0a9f6fed3, we don't have a specific test. Probably some
accidental failures of iotests are related.

Alternatively, we may merge this and following three reverts into final
"block: Manipulate ..." to avoid any kind of regression. But seems that
in this case having separate clear revert commits is better.

This reverts commit b0a9f6fed3.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-10-vsementsov@yandex-team.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Vladimir Sementsov-Ogievskiy
8393078032 block: introduce bdrv_open_file_child() helper
Almost all drivers call bdrv_open_child() similarly. Let's create a
helper for this.

The only not updated drivers that call bdrv_open_child() to set
bs->file are raw-format and snapshot-access:
    raw-format sometimes want to have filtered child but
        don't set drv->is_filter to true.
    snapshot-access wants only DATA | PRIMARY

Possibly we should implement drv->is_filter_func() handler, to consider
raw-format as filter when it works as filter.. But it's another story.

Note also, that we decrease assignments to bs->file in code: it helps
us restrict modifying this field in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220726201134.924743-3-vsementsov@yandex-team.ru>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:14:11 +02:00
Bin Meng
69fbfff95e block: Refactor get_tmp_filename()
At present there are two callers of get_tmp_filename() and they are
inconsistent.

One does:

    /* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
    char *tmp_filename = g_malloc0(PATH_MAX + 1);
    ...
    ret = get_tmp_filename(tmp_filename, PATH_MAX + 1);

while the other does:

    s->qcow_filename = g_malloc(PATH_MAX);
    ret = get_tmp_filename(s->qcow_filename, PATH_MAX);

As we can see different 'size' arguments are passed. There are also
platform specific implementations inside the function, and the use
of snprintf is really undesirable.

The function name is also misleading. It creates a temporary file,
not just a filename.

Refactor this routine by changing its name and signature to:

    char *create_tmp_file(Error **errp)

and use g_get_tmp_dir() / g_mkstemp() for a consistent implementation.

While we are here, add some comments to mention that /var/tmp is
preferred over /tmp on non-win32 hosts.

Signed-off-by: Bin Meng <bin.meng@windriver.com>
Message-Id: <20221010040432.3380478-2-bin.meng@windriver.com>
[kwolf: Fixed incorrect errno negation and iotest 051]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 20:13:32 +02:00
Bin Meng
6b6471eee1 block: Ignore close() failure in get_tmp_filename()
The temporary file has been created and is ready for use. Checking
return value of close() does not seem useful. The file descriptor
is almost certainly closed; see close(2) under "Dealing with error
returns from close()".

Let's simply ignore close() failure here.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221010040432.3380478-1-bin.meng@windriver.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-27 18:33:32 +02:00
Stefan Hajnoczi
e8b6535533 block: add BDRV_REQ_REGISTERED_BUF request flag
Block drivers may optimize I/O requests accessing buffers previously
registered with bdrv_register_buf(). Checking whether all elements of a
request's QEMUIOVector are within previously registered buffers is
expensive, so we need a hint from the user to avoid costly checks.

Add a BDRV_REQ_REGISTERED_BUF request flag to indicate that all
QEMUIOVector elements in an I/O request are known to be within
previously registered buffers.

Always pass the flag through to driver read/write functions. There is
little harm in passing the flag to a driver that does not use it.
Passing the flag to drivers avoids changes across many block drivers.
Filter drivers would need to explicitly support the flag and pass
through to their children when the children support it. That's a lot of
code changes and it's hard to remember to do that everywhere, leading to
silent reduced performance when the flag is accidentally dropped.

The only problematic scenario with the approach in this patch is when a
driver passes the flag through to internal I/O requests that don't use
the same I/O buffer. In that case the hint may be set when it should
actually be clear. This is a rare case though so the risk is low.

Some drivers have assert(!flags), which no longer works when
BDRV_REQ_REGISTERED_BUF is passed in. These assertions aren't very
useful anyway since the functions are called almost exclusively by
bdrv_driver_preadv/pwritev() so if we get flags handling right there
then the assertion is not needed.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20221013185908.1297568-7-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 14:56:42 -04:00
Emanuele Giuseppe Esposito
880eeec613 jobs: group together API calls under the same job lock
Now that the API offers also _locked() functions, take advantage
of it and give also the caller control to take the lock and call
_locked functions.

This makes sense especially when we have for loops, because it
makes no sense to have:

for(job = job_next(); ...)

where each job_next() takes the lock internally.
Instead we want

JOB_LOCK_GUARD();
for(job = job_next_locked(); ...)

In addition, protect also direct field accesses, by either creating a
new critical section or widening the existing ones.

Note: at this stage, job_{lock/unlock} and job lock guard macros
are *nop*.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Message-Id: <20220926093214.506243-12-eesposit@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07 12:11:41 +02:00
Paolo Bonzini
881a4c553c block: add missing coroutine_fn annotations
Callers of coroutine_fn must be coroutine_fn themselves, or the call
must be within "if (qemu_in_coroutine())".  Apply coroutine_fn to
functions where this holds.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220922084924.201610-3-pbonzini@redhat.com>
[kwolf: Fixed up coding style]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-10-07 12:11:40 +02:00
Alberto Faria
a9262f551e block: Change blk_{pread,pwrite}() param order
Swap 'buf' and 'bytes' around for consistency with
blk_co_{pread,pwrite}(), and in preparation to implement these functions
using generated_co_wrapper.

Callers were updated using this Coccinelle script:

    @@ expression blk, offset, buf, bytes, flags; @@
    - blk_pread(blk, offset, buf, bytes, flags)
    + blk_pread(blk, offset, bytes, buf, flags)

    @@ expression blk, offset, buf, bytes, flags; @@
    - blk_pwrite(blk, offset, buf, bytes, flags)
    + blk_pwrite(blk, offset, bytes, buf, flags)

It had no effect on hw/block/nand.c, presumably due to the #if, so that
file was updated manually.

Overly-long lines were then fixed by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-4-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Alberto Faria
3b35d4542c block: Add a 'flags' param to blk_pread()
For consistency with other I/O functions, and in preparation to
implement it using generated_co_wrapper.

Callers were updated using this Coccinelle script:

    @@ expression blk, offset, buf, bytes; @@
    - blk_pread(blk, offset, buf, bytes)
    + blk_pread(blk, offset, buf, bytes, 0)

It had no effect on hw/block/nand.c, presumably due to the #if, so that
file was updated manually.

Overly-long lines were then fixed by hand.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220705161527.1054072-3-afaria@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Alberto Faria
bf5b16fa40 block: Make blk_{pread,pwrite}() return 0 on success
They currently return the value of their 'bytes' parameter on success.

Make them return 0 instead, for consistency with other I/O functions and
in preparation to implement them using generated_co_wrapper. This also
makes it clear that short reads/writes are not possible.

Signed-off-by: Alberto Faria <afaria@redhat.com>
Message-Id: <20220705161527.1054072-2-afaria@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2022-07-12 12:14:56 +02:00
Hanna Reitz
15aee7ac95 block: Classify bdrv_get_flags() as I/O function
This function is safe to call in an I/O context, and qcow2_do_open()
does so (invoked in an I/O context by qcow2_co_invalidate_cache()).

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220427114057.36651-2-hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-05-04 15:55:23 +02:00
Marc-André Lureau
8e3b0cbb72 Replace qemu_real_host_page variables with inlined functions
Replace the global variables with inlined helper functions. getpagesize() is very
likely annotated with a "const" function attribute (at least with glibc), and thus
optimization should apply even better.

This avoids the need for a constructor initialization too.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220323155743.1585078-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-06 10:50:38 +02:00
Emanuele Giuseppe Esposito
f0c2832703 block_int-common.h: assertions in the callers of BdrvChildClass function pointers
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-29-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
da359909bd block_int-common.h: assertions in the callers of BlockDriver function pointers
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-27-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
1581a70ddd block/coroutines: I/O and "I/O or GS" API
block coroutines functions run in different aiocontext, and are
not protected by the BQL. Therefore are I/O.

On the other side, generated_co_wrapper functions use BDRV_POLL_WHILE,
meaning the caller can either be the main loop or a specific iothread.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-25-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
bdb734763b block.c: add assertions to static functions
Following the assertion derived from the API split,
propagate the assertion also in the static functions.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-18-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
696bf4c78c block: introduce assert_bdrv_graph_writable
We want to be sure that the functions that write the child and
parent list of a bs are under BQL and drain.

BQL prevents from concurrent writings from the GS API, while
drains protect from I/O.

TODO: drains are missing in some functions using this assert.
Therefore a proper assertion will fail. Because adding drains
requires additional discussions, they will be added in future
series.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-15-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
967d7905d1 IO_CODE and IO_OR_GS_CODE for block_int I/O API
Mark all I/O functions with IO_CODE, and all "I/O OR GS" with
IO_OR_GS_CODE.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-14-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
b4ad82aab1 assertions for block_int global state API
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-13-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
862fded928 block.c: assertions to the block layer permissions API
Now that we "covered" the three main cases where the
permission API was being used under BQL (fuse,
amend and invalidate_cache), we can safely assert for
the permission functions implemented in block.c

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-11-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
384a48fb74 IO_CODE and IO_OR_GS_CODE for block I/O API
Mark all I/O functions with IO_CODE, and all "I/O OR GS" with
IO_OR_GS_CODE.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-6-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
f791bf7f93 assertions for block global state API
All the global state (GS) API functions will check that
qemu_in_main_thread() returns true. If not, it means
that the safety of BQL cannot be guaranteed, and
they need to be moved to I/O.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-5-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
3b491a9056 include/block/block: split header into I/O and global state API
block.h currently contains a mix of functions:
some of them run under the BQL and modify the block layer graph,
others are instead thread-safe and perform I/O in iothreads.
Some others can only be called by either the main loop or the
iothread running the AioContext (and not other iothreads),
and using them in another thread would cause deadlocks, and therefore
it is not ideal to define them as I/O.

It is not easy to understand which function is part of which
group (I/O vs GS vs "I/O or GS"), and this patch aims to clarify it.

The "GS" functions need the BQL, and often use
aio_context_acquire/release and/or drain to be sure they
can modify the graph safely.
The I/O function are instead thread safe, and can run in
any AioContext.
"I/O or GS" functions run instead in the main loop or in
a single iothread, and use BDRV_POLL_WHILE().

By splitting the header in two files, block-io.h
and block-global-state.h we have a clearer view on what
needs what kind of protection. block-common.h
contains common structures shared by both headers.

block.h is left there for legacy and to avoid changing
all includes in all c files that use the block APIs.

Assertions are added in the next patch.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220303151616.325444-4-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:18:25 +01:00
Emanuele Giuseppe Esposito
11d0c9b37e block: move BQL logic of bdrv_co_invalidate_cache in bdrv_activate
Split bdrv_co_invalidate cache in two: the Global State (under BQL)
code that takes care of permissions and running GS callbacks,
and leave only the I/O code (->bdrv_co_invalidate_cache) running in
the I/O coroutine.

The only side effect is that bdrv_co_invalidate_cache is not
recursive anymore, and so is every direct call to
bdrv_invalidate_cache().

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20220209105452.1694545-6-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:14:40 +01:00
Emanuele Giuseppe Esposito
3b71719462 block: rename bdrv_invalidate_cache_all, blk_invalidate_cache and test_sync_op_invalidate_cache
Following the bdrv_activate renaming, change also the name
of the respective callers.

bdrv_invalidate_cache_all -> bdrv_activate_all
blk_invalidate_cache -> blk_activate
test_sync_op_invalidate_cache -> test_sync_op_activate

No functional change intended.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220209105452.1694545-5-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:14:40 +01:00
Emanuele Giuseppe Esposito
a94750d956 block: introduce bdrv_activate
This function is currently just a wrapper for bdrv_invalidate_cache(),
but in future will contain the code of bdrv_co_invalidate_cache() that
has to always be protected by BQL, and leave the rest in the I/O
coroutine.

Replace all bdrv_invalidate_cache() invokations with bdrv_activate().

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20220209105452.1694545-4-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-03-04 18:14:40 +01:00
Vladimir Sementsov-Ogievskiy
c0829cb1fd block: bdrv_set_backing_hd(): use drained section
Graph modifications should be done in drained section. stream_prepare()
handler of block stream job call bdrv_set_backing_hd() without using
drained section and it's theoretically possible that some IO request
will interleave with graph modification and will use outdated pointers
to removed block nodes.

Some other callers use bdrv_set_backing_hd() not caring about drained
sections too. So it seems good to make a drained section exactly in
bdrv_set_backing_hd().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20220124173741.2984056-1-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-02-01 13:49:15 +01:00
Vladimir Sementsov-Ogievskiy
64631f3681 block: drop BLK_PERM_GRAPH_MOD
First, this permission never protected a node from being changed, as
generic child-replacing functions don't check it.

Second, it's a strange thing: it presents a permission of parent node
to change its child. But generally, children are replaced by different
mechanisms, like jobs or qmp commands, not by nodes.

Graph-mod permission is hard to understand. All other permissions
describe operations which done by parent node on its child: read,
write, resize. Graph modification operations are something completely
different.

The only place where BLK_PERM_GRAPH_MOD is used as "perm" (not shared
perm) is mirror_start_job, for s->target. Still modern code should use
bdrv_freeze_backing_chain() to protect from graph modification, if we
don't do it somewhere it may be considered as a bug. So, it's a bit
risky to drop GRAPH_MOD, and analyzing of possible loss of protection
is hard. But one day we should do it, let's do it now.

One more bit of information is that locking the corresponding byte in
file-posix doesn't make sense at all.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210902093754.2352-1-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14 12:03:16 +01:00
Emanuele Giuseppe Esposito
fa8fc1d09f block_int: make bdrv_backing_overridden static
bdrv_backing_overridden is only used in block.c, so there is
no need to leave it in block_int.h

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20211215121140.456939-2-eesposit@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-01-14 12:03:16 +01:00
Hanna Reitz
b0a9f6fed3 block: Let replace_child_noperm free children
In most of the block layer, especially when traversing down from other
BlockDriverStates, we assume that BdrvChild.bs can never be NULL.  When
it becomes NULL, it is expected that the corresponding BdrvChild pointer
also becomes NULL and the BdrvChild object is freed.

Therefore, once bdrv_replace_child_noperm() sets the BdrvChild.bs
pointer to NULL, it should also immediately set the corresponding
BdrvChild pointer (like bs->file or bs->backing) to NULL.

In that context, it also makes sense for this function to free the
child.  Sometimes we cannot do so, though, because it is called in a
transactional context where the caller might still want to reinstate the
child in the abort branch (and free it only on commit), so this behavior
has to remain optional.

In bdrv_replace_child_tran()'s abort handler, we now rely on the fact
that the BdrvChild passed to bdrv_replace_child_tran() must have had a
non-NULL .bs pointer initially.  Make a note of that and assert it.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-10-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-10-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:46 +01:00
Hanna Reitz
82b54cf516 block: Let replace_child_tran keep indirect pointer
As of a future commit, bdrv_replace_child_noperm() will clear the
indirect BdrvChild pointer passed to it if the new child BDS is NULL.
bdrv_replace_child_tran() will want to let it do that, but revert this
change in its abort handler.  For that, we need to have it receive a
BdrvChild ** pointer, too, and keep it stored in the
BdrvReplaceChildState object that we attach to the transaction.

Note that we do not need to store it in the BdrvReplaceChildState when
new_bs is not NULL, because then there is nothing to revert.  This is
important so that bdrv_replace_node_noperm() can pass a pointer to a
loop-local variable to bdrv_replace_child_tran() without worrying that
this pointer will outlive one loop iteration.

(Of course, for that to work, bdrv_replace_node_noperm() and in turn
bdrv_replace_node() and its relatives may not be called with a NULL @to
node.  Luckily, they already are not, but now we should assert this.)

bdrv_remove_file_or_backing_child() on the other hand needs to ensure
that the indirect pointer it passes will stay valid for the duration of
the transaction.  Ensure this by keeping a strong reference to the BDS
whose &bs->backing or &bs->file it passes to bdrv_replace_child_tran(),
and giving up that reference only in the transaction .clean() handler.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-9-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-9-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:45 +01:00
Hanna Reitz
562bda8bb4 block: Restructure remove_file_or_backing_child()
As of a future patch, bdrv_replace_child_tran() will take a BdrvChild **
pointer.  Prepare for that by getting such a pointer and using it where
applicable, and (dereferenced) as a parameter for
bdrv_replace_child_tran().

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-7-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-7-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:42 +01:00
Hanna Reitz
be64bbb014 block: Pass BdrvChild ** to replace_child_noperm
bdrv_replace_child_noperm() modifies BdrvChild.bs, and can potentially
set it to NULL.  That is dangerous, because BDS parents generally assume
that their children's .bs pointer is never NULL.  We therefore want to
let bdrv_replace_child_noperm() set the corresponding BdrvChild pointer
to NULL, too.

This patch lays the foundation for it by passing a BdrvChild ** pointer
to bdrv_replace_child_noperm() so that it can later use it to NULL the
BdrvChild pointer immediately after setting BdrvChild.bs to NULL.

(We will still need to undertake some intermediate steps, though.)

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-6-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-6-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:41 +01:00
Hanna Reitz
2651806141 block: Drop detached child from ignore list
bdrv_attach_child_common_abort() restores the parent's AioContext.  To
do so, the child (which was supposed to be attached, but is now detached
again by this abort handler) is added to the ignore list for the
AioContext changing functions.

However, since we modify a BDS's children list in the BdrvChildClass's
.attach and .detach handlers, the child is already effectively detached
from the parent by this point.  We do not need to put it into the ignore
list.

Use this opportunity to clean up the empty line structure: Keep setting
the ignore list, invoking the AioContext function, and freeing the
ignore list in blocks separated by empty lines.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211111120829.81329-5-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-5-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:39 +01:00
Hanna Reitz
04c9c3a52c block: Unite remove_empty_child and child_free
Now that bdrv_remove_empty_child() no longer removes the child from the
parent's children list but only checks that it is not in such a list, it
is only a wrapper around bdrv_child_free() that checks that the child is
empty and unused.  That should apply to all children that we free, so
put those checks into bdrv_child_free() and drop
bdrv_remove_empty_child().

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211111120829.81329-4-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-4-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:38 +01:00
Hanna Reitz
a225369bce block: Manipulate children list in .attach/.detach
The children list is specific to BDS parents.  We should not modify it
in the general children modification code, but let BDS parents deal with
it in their .attach() and .detach() methods.

This also has the advantage that a BdrvChild is removed from the
children list before its .bs pointer can become NULL.  BDS parents
generally assume that their children's .bs pointer is never NULL, so
this is actually a bug fix.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211111120829.81329-3-hreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-3-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:36 +01:00
Kevin Wolf
bfb8aa6d58 block: Fail gracefully when blockdev-snapshot creates loops
Using blockdev-snapshot to append a node as an overlay to itself, or to
any of its parents, causes crashes. Catch the condition and return an
error for these cases instead.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1824363
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211018134714.48438-1-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-11-02 13:02:46 +01:00
Vladimir Sementsov-Ogievskiy
b11c8739ae block: bdrv_insert_node(): don't use bdrv_open()
Use bdrv_new_open_driver_opts() instead of complicated bdrv_open().

Among other extra things bdrv_open() also check for white-listed
formats, which we don't want for internal node creation: currently
backup doesn't work when copy-before-write filter is not white-listed.
As well block-stream doesn't work when copy-on-read is not
white-listed.

Fixes: 751cec7a26
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=2004812
Reported-by: Yanan Fu
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210920115538.264372-5-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-06 10:25:55 +02:00
Vladimir Sementsov-Ogievskiy
96796fae6f block: bdrv_insert_node(): doc and style
- options & flags is common pair for open-like functions, let's use it
 - add a comment that specifies use of @options

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210920115538.264372-4-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-06 10:25:55 +02:00
Vladimir Sementsov-Ogievskiy
f053b7e800 block: bdrv_insert_node(): fix and improve error handling
- use ERRP_GUARD(): function calls error_prepend(), so it must use
   ERRP_GUARD(), otherwise error_prepend() would not be called when
   passed errp is error_fatal

 - drop error propagation, handle return code instead

 - for symmetry, do error_prepend() for the second failure

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210920115538.264372-3-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-06 10:25:55 +02:00
Vladimir Sementsov-Ogievskiy
621d17378a block: implement bdrv_new_open_driver_opts()
Add version of bdrv_new_open_driver() that supports QDict options.
We'll use it in further commit.

Simply add one more argument to bdrv_new_open_driver() is worse, as
there are too many invocations of bdrv_new_open_driver() to update
then.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210920115538.264372-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-10-06 10:25:55 +02:00
Vladimir Sementsov-Ogievskiy
a13de40a05 block: bdrv_inactivate_recurse(): check for permissions and fix crash
We must not inactivate child when parent has write permissions on
it.

Calling .bdrv_inactivate() doesn't help: actually only qcow2 has this
handler and it is used to flush caches, not for permission
manipulations.

So, let's simply check cumulative parent permissions before
inactivating the node.

This commit fixes a crash when we do migration during backup: prior to
the commit nothing prevents all nodes inactivation at migration finish
and following backup write to the target crashes on assertion
"assert(!(bs->open_flags & BDRV_O_INACTIVE));" in
bdrv_co_write_req_prepare().

After the commit, we rely on the fact that copy-before-write filter
keeps write permission on target node to be able to write to it. So
inactivation fails and migration fails as expected.

Corresponding test now passes, so, enable it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20210911120027.8063-3-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-09-15 15:54:07 +02:00
Hanna Reitz
0bc329fbb0 block: block-status cache for data regions
As we have attempted before
(https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg06451.html,
"file-posix: Cache lseek result for data regions";
https://lists.nongnu.org/archive/html/qemu-block/2021-02/msg00934.html,
"file-posix: Cache next hole"), this patch seeks to reduce the number of
SEEK_DATA/HOLE operations the file-posix driver has to perform.  The
main difference is that this time it is implemented as part of the
general block layer code.

The problem we face is that on some filesystems or in some
circumstances, SEEK_DATA/HOLE is unreasonably slow.  Given the
implementation is outside of qemu, there is little we can do about its
performance.

We have already introduced the want_zero parameter to
bdrv_co_block_status() to reduce the number of SEEK_DATA/HOLE calls
unless we really want zero information; but sometimes we do want that
information, because for files that consist largely of zero areas,
special-casing those areas can give large performance boosts.  So the
real problem is with files that consist largely of data, so that
inquiring the block status does not gain us much performance, but where
such an inquiry itself takes a lot of time.

To address this, we want to cache data regions.  Most of the time, when
bad performance is reported, it is in places where the image is iterated
over from start to end (qemu-img convert or the mirror job), so a simple
yet effective solution is to cache only the current data region.

(Note that only caching data regions but not zero regions means that
returning false information from the cache is not catastrophic: Treating
zeroes as data is fine.  While we try to invalidate the cache on zero
writes and discards, such incongruences may still occur when there are
other processes writing to the image.)

We only use the cache for nodes without children (i.e. protocol nodes),
because that is where the problem is: Drivers that rely on block-status
implementations outside of qemu (e.g. SEEK_DATA/HOLE).

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/307
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20210812084148.14458-3-hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[hreitz: Added `local_file == bs` assertion, as suggested by Vladimir]
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-09-15 15:54:06 +02:00
Vladimir Sementsov-Ogievskiy
bd8f4c42c8 block: introduce bdrv_replace_child_bs()
Add function to transactionally replace bs inside BdrvChild.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210824083856.17408-2-vsementsov@virtuozzo.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-09-01 12:57:31 +02:00
Kevin Wolf
e5f05f8c37 block: Add option to use driver whitelist even in tools
Currently, the block driver whitelists are only applied for the system
emulator. All other binaries still give unrestricted access to all block
drivers. There are use cases where this made sense because the main
concern was avoiding customers running VMs on less optimised block
drivers and getting bad performance. Allowing the same image format e.g.
as a target for 'qemu-img convert' is not a problem then.

However, if the concern is the supportability of the driver in general,
either in full or when used read-write, not applying the list driver
whitelist in tools doesn't help - especially since qemu-nbd and
qemu-storage-daemon now give access to more or less the same operations
in block drivers as running a system emulator.

In order to address this, introduce a new configure option that enforces
the driver whitelist in all binaries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210709164141.254097-1-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-20 13:14:45 +02:00
Kevin Wolf
6cf42ca2f9 block: Acquire AioContexts during bdrv_reopen_multiple()
As the BlockReopenQueue can contain nodes in multiple AioContexts, only
one of which may be locked when AIO_WAIT_WHILE() can be called, we can't
let the caller lock the right contexts. Instead, individually lock the
AioContext of a single node when iterating the queue.

Reintroduce bdrv_reopen() as a wrapper for reopening a single node that
drains the node and temporarily drops the AioContext lock for
bdrv_reopen_multiple().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-4-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-09 13:19:11 +02:00
Alberto Garcia
ab5b522879 block: Add bdrv_reopen_queue_free()
Move the code to free a BlockReopenQueue to a separate function.
It will be used in a subsequent patch.

[ kwolf: Also free explicit_options and options, and explicitly
  qobject_ref() the value when it continues to be used. This makes
  future memory leaks less likely. ]

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210708114709.206487-3-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-09 13:19:11 +02:00
Eric Blake
497a30dbb0 qemu-img: Require -F with -b backing image
Back in commit d9f059aa6c (qemu-img: Deprecate use of -b without -F),
we deprecated the ability to create a file with a backing image that
requires qemu to perform format probing.  Qemu can still probe older
files for backwards compatibility, but it is time to finish off the
ability to create such images, due to the potential security risk they
present.  Update a couple of iotests affected by the change.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210503213600.569128-3-eblake@redhat.com>
Reviewed-by: Connor Kuehl <ckuehl@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-07-09 13:18:32 +02:00
Alberto Garcia
ecd30d2d97 block: Allow changing bs->file on reopen
When the x-blockdev-reopen was added it allowed reconfiguring the
graph by replacing backing files, but changing the 'file' option was
forbidden. Because of this restriction some operations are not
possible, notably inserting and removing block filters.

This patch adds support for replacing the 'file' option. This is
similar to replacing the backing file and the user is likewise
responsible for the correctness of the resulting graph, otherwise this
can lead to data corruption.

Signed-off-by: Alberto Garcia <berto@igalia.com>
 [vsementsov: bdrv_reopen_parse_file_or_backing() is modified a lot]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-9-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
3d0e8743f0 block: BDRVReopenState: drop replace_backing_bs field
It's used only in bdrv_reopen_commit(). "backing" is covered by the
loop through all children except for case when we removed backing child
during reopen.

Make it more obvious and drop extra boolean field: qdict_del will not
fail if there is no such entry.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-8-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
25f78d9e2d block: move supports_backing check to bdrv_set_file_or_backing_noperm()
Move supports_backing check of bdrv_reopen_parse_backing to called
(through bdrv_set_backing_noperm()) bdrv_set_file_or_backing_noperm()
function. The check applies to general case, so it's appropriate for
bdrv_set_file_or_backing_noperm().

We have to declare backing support for two test drivers, otherwise new
check fails.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-7-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
cbfdb98ce2 block: bdrv_reopen_parse_backing(): simplify handling implicit filters
The logic around finding overlay here is not obvious. Actually it does
two simple things:

1. If new bs is already in backing chain, split from parent bs by
   several implicit filters we are done, do nothing.

2. Otherwise, don't try to replace implicit filter.

Let's rewrite this in more obvious way.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-6-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
bfae052a57 block: bdrv_reopen_parse_backing(): don't check frozen child
bdrv_set_backing_noperm() takes care of it (actual check is in
bdrv_set_file_or_backing_noperm()), so we don't need to check it here.

While being here, improve error message a bit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-5-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
fd26b8a089 block: bdrv_reopen_parse_backing(): don't check aio context
We don't need this check: bdrv_set_backing_noperm() will do it anyway
(actually in bdrv_attach_child_common()).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-4-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
e9238278c2 block: introduce bdrv_set_file_or_backing_noperm()
To be used for reopen in future commit.

Notes:
 - It seems OK to update inherits_from if new bs is recursively inherits
 from parent bs. Let's just not check for backing_chain_contains, to
 support file child of non-filters.

 - Simply check child->frozen instead of
   bdrv_is_backing_chain_frozen(), as we really interested only in this
   one child.

 - Role determination of new child is a bit more complex: it remains
   the same for backing child, it's obvious for filter driver. But for
   non-filter file child let's for now restrict to only replacing
   existing child (and keeping its role).

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-3-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
5b9950193b block: introduce bdrv_remove_file_or_backing_child()
To be used for reopen in future commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610120537.196183-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
7ec390d587 block: comment graph-modifying function not updating permissions
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210610112618.127378-3-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Vladimir Sementsov-Ogievskiy
4bf021dbd5 block: rename bdrv_replace_child to bdrv_replace_child_tran
We have bdrv_replace_child() wrapper on bdrv_replace_child_noperm().
But bdrv_replace_child() doesn't update permissions. It's rather
strange, as normally it's expected that foo() should call foo_noperm()
and update permissions.

Let's rename and add comment.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210610112618.127378-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Max Reitz
d5b2399458 block: BDRV_O_NO_IO for backing file on creation
When creating an image file with a backing file, we generally try to
open the backing file (unless -u was specified), mostly to verify that
it is there, but also to get the file size if none was specified for the
new image.

For neither of these things do we need data I/O, and so we can pass
BDRV_O_NO_IO when opening the backing file.  This allows us to open even
encrypted backing images without requiring the user to provide a secret.

This makes the -u switch in iotests 189 and 198 unnecessary (and the
$size parameter), so drop it, because this way we get regression tests
for this patch here.

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/441
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210622140030.212487-1-mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Miroslav Rezanina
2d369d6e6e Prevent compiler warning on block.c
Commit 3108a15cf (block: introduce bdrv_drop_filter()) introduced
uninitialized variable to_cow_parent in bdrv_replace_node_common
function that is used only when detach_subchain is true. It is used in
two places. First if block properly initialize the variable and second
block use it.

However, compiler may treat these two blocks as two independent cases so
it thinks first block can fail test and second one pass (although both
use same condition). This cause warning that variable can be
uninitialized in second block.

The warning was observed with GCC 8.4.1 and 11.0.1.

To prevent this warning, initialize the variable with NULL.

Signed-off-by: Miroslav Rezanina <mrezanin@redhat.com>
Message-Id: <1162368493.17178530.1620201543649.JavaMail.zimbra@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-29 16:51:00 +02:00
Joelle van Dyne
feccdceed2 block: check for sys/disk.h
Some BSD platforms do not have this header.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Joelle van Dyne <j@getutm.app>
Message-Id: <20210315180341.31638-3-j@getutm.app>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 10:54:13 +02:00
Vladimir Sementsov-Ogievskiy
30ebb9aa92 block: improve permission conflict error message
Now permissions are updated as follows:
 1. do graph modifications ignoring permissions
 2. do permission update

 (of course, we rollback [1] if [2] fails)

So, on stage [2] we can't say which users are "old" and which are
"new" and exist only since [1]. And current error message is a bit
outdated. Let's improve it, to make everything clean.

While being here, add also a comment and some good assertions.

iotests 283, 307, qsd-jobs outputs are updated.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210601075218.79249-7-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy
da261b69ae block: simplify bdrv_child_user_desc()
All child classes have this callback. So, drop unreachable code.

Still add an assertion to bdrv_attach_child_common(), to early detect
bad classes.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210601075218.79249-6-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy
2c0a3acb95 block: improve bdrv_child_get_parent_desc()
We have different types of parents: block nodes, block backends and
jobs. So, it makes sense to specify type together with name.

Next, this handler us used to compose an error message about permission
conflict. And permission conflict occurs in a specific place of block
graph. We shouldn't report name of parent device (as it refers another
place in block graph), but exactly and only the name of the node. So,
use bdrv_get_node_name() directly.

iotest 283 output is updated.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-Id: <20210601075218.79249-4-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy
f8d2ad7881 block: document child argument of bdrv_attach_child_common()
The logic around **child is not obvious: this reference is used not
only to return resulting child, but also to rollback NULL value on
transaction abort.

So, let's add documentation and some assertions.

While being here, drop extra declaration of bdrv_attach_child_noperm().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210601075218.79249-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy
975da07374 block: drop BlockDriverState::read_only
This variable is just a cache for !(bs->open_flags & BDRV_O_RDWR),
which we have to synchronize everywhere. Let's just drop it and
consistently use bdrv_is_read_only().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210527154056.70294-3-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy
307261b243 block: consistently use bdrv_is_read_only()
It's better to use accessor function instead of bs->read_only directly.
In some places use bdrv_is_writable() instead of
checking both BDRV_O_RDWR set and BDRV_O_INACTIVE not set.

In bdrv_open_common() it's a bit strange to add one more variable, but
we are going to drop bs->read_only in the next patch, so new ro local
variable substitutes it here.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210527154056.70294-2-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Vladimir Sementsov-Ogievskiy
fb62b58896 block/vvfat: child_vvfat_qcow: add .get_parent_aio_context, fix crash
Commit 3ca1f32257
"block: BdrvChildClass: add .get_parent_aio_context handler" introduced
new handler and commit 228ca37e12
"block: drop ctx argument from bdrv_root_attach_child" made a generic
use of it. But 3ca1f32257 didn't update
child_vvfat_qcow. Fix that.

Before that fix the command

./build/qemu-system-x86_64 -usb -device usb-storage,drive=fat16 \
  -drive file=fat:rw:fat-type=16:"<path of a host folder>",id=fat16,format=raw,if=none

crashes:

1  bdrv_child_get_parent_aio_context (c=0x559d62426d20)
    at ../block.c:1440
2  bdrv_attach_child_common
    (child_bs=0x559d62468190, child_name=0x559d606f9e3d "write-target",
     child_class=0x559d60c58d20 <child_vvfat_qcow>, child_role=3,
     perm=3, shared_perm=4, opaque=0x559d62445690,
     child=0x7ffc74c2acc8, tran=0x559d6246ddd0, errp=0x7ffc74c2ae60)
    at ../block.c:2795
3  bdrv_attach_child_noperm
    (parent_bs=0x559d62445690, child_bs=0x559d62468190,
     child_name=0x559d606f9e3d "write-target",
     child_class=0x559d60c58d20 <child_vvfat_qcow>, child_role=3,
     child=0x7ffc74c2acc8, tran=0x559d6246ddd0, errp=0x7ffc74c2ae60) at
    ../block.c:2855
4  bdrv_attach_child
    (parent_bs=0x559d62445690, child_bs=0x559d62468190,
     child_name=0x559d606f9e3d "write-target",
     child_class=0x559d60c58d20 <child_vvfat_qcow>, child_role=3,
     errp=0x7ffc74c2ae60) at ../block.c:2953
5  bdrv_open_child
    (filename=0x559d62464b80 "/var/tmp/vl.h3TIS4",
     options=0x559d6246ec20, bdref_key=0x559d606f9e3d "write-target",
     parent=0x559d62445690, child_class=0x559d60c58d20
     <child_vvfat_qcow>, child_role=3, allow_none=false,
     errp=0x7ffc74c2ae60) at ../block.c:3351
6  enable_write_target (bs=0x559d62445690, errp=0x7ffc74c2ae60) at
   ../block/vvfat.c:3176
7  vvfat_open (bs=0x559d62445690, options=0x559d6244adb0, flags=155650,
               errp=0x7ffc74c2ae60) at ../block/vvfat.c:1236
8  bdrv_open_driver (bs=0x559d62445690, drv=0x559d60d4f7e0
                     <bdrv_vvfat>, node_name=0x0,
                     options=0x559d6244adb0, open_flags=155650,
                     errp=0x7ffc74c2af70) at ../block.c:1557
9  bdrv_open_common (bs=0x559d62445690, file=0x0,
                     options=0x559d6244adb0, errp=0x7ffc74c2af70) at
...

(gdb) fr 1
 #1  0x0000559d603ea3bf in bdrv_child_get_parent_aio_context
     (c=0x559d62426d20) at ../block.c:1440
1440        return c->klass->get_parent_aio_context(c);
 (gdb) p c->klass
$1 = (const BdrvChildClass *) 0x559d60c58d20 <child_vvfat_qcow>
 (gdb) p c->klass->get_parent_aio_context
$2 = (AioContext *(*)(BdrvChild *)) 0x0

Fixes: 3ca1f32257
Fixes: 228ca37e12
Reported-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210524101257.119377-2-vsementsov@virtuozzo.com>
Tested-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-06-02 14:23:20 +02:00
Kevin Wolf
e3fc91aaaa block: Fix Transaction leak in bdrv_reopen_multiple()
Like other error paths, this one needs to call tran_finalize() and clean
up the BlockReopenQueue, too.

Fixes: CID 1452772
Fixes: 72373e40fb
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210503110555.24001-3-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-18 11:08:13 +02:00
Kevin Wolf
e878bb1293 block: Fix Transaction leak in bdrv_root_attach_child()
The error path needs to call tran_finalize(), too.

Fixes: CID 1452773
Fixes: 548a74c0db
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210503110555.24001-2-kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-05-18 11:08:13 +02:00
Vladimir Sementsov-Ogievskiy
ad578c56d5 block: drop write notifiers
They are unused now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210506090621.11848-3-vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2021-05-14 16:14:10 +02:00
Thomas Huth
4c386f8064 Do not include sysemu/sysemu.h if it's not really necessary
Stop including sysemu/sysemu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Vladimir Sementsov-Ogievskiy
c20555e15f block: refactor bdrv_node_check_perm()
Now, bdrv_node_check_perm() is called only with fresh cumulative
permissions, so its actually "refresh_perm".

Move permission calculation to the function. Also, drop unreachable
error message and rewrite the remaining one to be more generic (as now
we don't know which node is added and which was already here).

Add also Virtuozzo copyright, as big work is done at this point.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-37-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
2fe5ff56f1 block: rename bdrv_replace_child_safe() to bdrv_replace_child()
We don't have bdrv_replace_child(), so it's time for
bdrv_replace_child_safe() to take its place.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-36-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
ecb776bd93 block: refactor bdrv_child_set_perm_safe() transaction action
Old interfaces dropped, nobody directly calls
bdrv_child_set_perm_abort() and bdrv_child_set_perm_commit(), so we can
use personal state structure for the action and stop exploiting
BdrvChild structure. Also, drop "_safe" suffix which is redundant now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-35-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
4954aacea0 block: inline bdrv_replace_child()
bdrv_replace_child() has only one caller, the second argument is
unused. Inline it now. This triggers deletion of some more unused
interfaces.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-34-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
25409807cf block: inline bdrv_check_perm_common()
bdrv_check_perm_common() has only one caller, so no more sense in
"common".

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-33-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
058acc4708 block: drop unused permission update functions
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-32-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
72373e40fb block: bdrv_reopen_multiple: refresh permissions on updated graph
Move bdrv_reopen_multiple to new paradigm of permission update:
first update graph relations, then do refresh the permissions.

We have to modify reopen process in file-posix driver: with new scheme
we don't have prepared permissions in raw_reopen_prepare(), so we
should reconfigure fd in raw_check_perm(). Still this seems more native
and simple anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-31-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
a2aabf8895 block: bdrv_reopen_multiple(): move bdrv_flush to separate pre-prepare
During reopen we may add backing bs from other aio context, which may
lead to changing original context of top bs.

We are going to move graph modification to prepare stage. So, it will
be possible that bdrv_flush() in bdrv_reopen_prepare called on bs in
non-original aio context, which we didn't aquire which leads to crash.

To avoid this problem move bdrv_flush() to be a separate reopen stage
before bdrv_reopen_prepare().

This doesn't seem correct to acquire only one aio context and not all
contexts participating in reopen. But it's not obvious how to do it
correctly, keeping in mind:

 1. rules of bdrv_set_aio_context_ignore() that requires new_context
    lock not being held

 2. possible deadlocks because of holding all (or several?) AioContext
    locks

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-30-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
160333e1fe block: add bdrv_set_backing_noperm() transaction action
Split out no-perm part of bdrv_set_backing_hd() as a separate
transaction action. Note the in case of existing BdrvChild we reuse it,
not recreate, just to do less actions.

We don't need to create extra reference to backing_hd as we don't lose
it in bdrv_attach_child().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-29-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
1e4c797c75 block: make bdrv_refresh_limits() to be a transaction action
To be used in further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-28-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
332b3a175f block: make bdrv_unset_inherits_from to be a transaction action
To be used in the further commit.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-27-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
9397c14fcb block: drop ignore_children for permission update functions
This argument is always NULL. Drop it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-26-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
3108a15cf0 block: introduce bdrv_drop_filter()
Using bdrv_replace_node() for removing filter is not good enough: it
keeps child reference of the filter, which may conflict with original
top node during permission update.

Instead let's create new interface, which will do all graph
modifications first and then update permissions.

Let's modify bdrv_replace_node_common(), allowing it additionally drop
backing chain child link pointing to new node. This is quite
appropriate for bdrv_drop_intermediate() and makes possible to add
new bdrv_drop_filter() as a simple wrapper.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-24-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
46541ee579 block: add bdrv_remove_filter_or_cow transaction action
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-23-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
2272edcfff block: adapt bdrv_append() for inserting filters
bdrv_append is not very good for inserting filters: it does extra
permission update as part of bdrv_set_backing_hd(). During this update
filter may conflict with other parents of top_bs.

Instead, let's first do all graph modifications and after it update
permissions.

append-greedy-filter test-case in test-bdrv-graph-mod is now works, so
move it out of debug option.

Note: bdrv_append() is still only works for backing-child based
filters. It's something to improve later.

Note2: we use the fact that bdrv_append() is used to append new nodes,
without backing child, so we don't need frozen check and inherits_from
logic from bdrv_set_backing_hd().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-22-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00
Vladimir Sementsov-Ogievskiy
117caba9fc block: split out bdrv_replace_node_noperm()
Split part of bdrv_replace_node_common() to be used separately.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-21-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2021-04-30 12:27:48 +02:00