Commit Graph

940 Commits

Author SHA1 Message Date
Vladimir Sementsov-Ogievskiy
d93e572688 block/io: bdrv_pdiscard: support int64_t bytes parameter
This fixes at least one overflow in qcow2_process_discards, which
passes 64bit region length to bdrv_pdiscard where bytes (or sectors in
the past) parameter is int since its introduction in 0b919fae.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 16:55:58 +02:00
Kevin Wolf
42a65f02f9 block: Remove bdrv_set_aio_context()
All callers of bdrv_set_aio_context() are eliminated now, they have
moved to bdrv_try_set_aio_context() and related safe functions. Remove
bdrv_set_aio_context().

With this, we can now know that the .set_aio_ctx callback must be
present in bdrv_set_aio_context_ignore() because
bdrv_can_set_aio_context() would have returned false previously, so
instead of checking the condition, we can assert it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 16:55:58 +02:00
Kevin Wolf
132ada80c4 block: Adjust AioContexts when attaching nodes
So far, we only made sure that updating the AioContext of a node
affected the whole subtree. However, if a node is newly attached to a
new parent, we also need to make sure that both the subtree of the node
and the parent are in the same AioContext. This tries to move the new
child node to the parent AioContext and returns an error if this isn't
possible.

BlockBackends now actually apply their AioContext to their root node.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:22:22 +02:00
Kenneth Heitke
3036a626e9 nvme: add Get/Set Feature Timestamp support
Signed-off-by: Kenneth Heitke <kenneth.heitke@intel.com>
Reviewed-by: Klaus Birkelund Jensen <klaus.jensen@cnexlabs.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:22:09 +02:00
Julia Suvorova
2b02fd81de block/linux-aio: Drop unused BlockAIOCB submission method
Callback-based laio_submit() and laio_cancel() were left after
rewriting Linux AIO backend to coroutines in hope that they would be
used in other code that could bypass coroutines. They can be safely
removed because they have not been used since that time.

Signed-off-by: Julia Suvorova <jusual@mail.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:20:41 +02:00
Vladimir Sementsov-Ogievskiy
69f47505ee block: avoid recursive block_status call if possible
drv_co_block_status digs bs->file for additional, more accurate search
for hole inside region, reported as DATA by bs since 5daa74a6eb.

This accuracy is not free: assume we have qcow2 disk. Actually, qcow2
knows, where are holes and where is data. But every block_status
request calls lseek additionally. Assume a big disk, full of
data, in any iterative copying block job (or img convert) we'll call
lseek(HOLE) on every iteration, and each of these lseeks will have to
iterate through all metadata up to the end of file. It's obviously
ineffective behavior. And for many scenarios we don't need this lseek
at all.

However, lseek is needed when we have metadata-preallocated image.

So, let's detect metadata-preallocation case and don't dig qcow2's
protocol file in other cases.

The idea is to compare allocation size in POV of filesystem with
allocations size in POV of Qcow2 (by refcounts). If allocation in fs is
significantly lower, consider it as metadata-preallocation case.

102 iotest changed, as our detector can't detect shrinked file as
metadata-preallocation, which don't seem to be wrong, as with metadata
preallocation we always have valid file length.

Two other iotests have a slight change in their QMP output sequence:
Active 'block-commit' returns earlier because the job coroutine yields
earlier on a blocking operation. This operation is loading the refcount
blocks in qcow2_detect_metadata_preallocation().

Suggested-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-06-04 15:20:41 +02:00
Kevin Wolf
53a7d04185 block: Propagate AioContext change to parents
All block nodes and users in any connected component of the block graph
must be in the same AioContext, so changing the AioContext of one node
must not only change all of its children, but all of its parents (and
in turn their children etc.) as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-20 17:08:56 +02:00
Kevin Wolf
a3a683c33d block: Make bdrv_attach/detach_aio_context() static
Since commit b97511c7bc, there is no reason for block drivers any more
to call these functions (see the function comment in block_int.h). They
are now just internal helper functions for bdrv_set_aio_context()
and can be made static.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-20 17:08:56 +02:00
Kevin Wolf
5d2318499f block: Add bdrv_try_set_aio_context()
Eventually, we want to make sure that all parents and all children of a
node are in the same AioContext as the node itself. This means that
changing the AioContext may fail because one of the other involved
parties (e.g. a guest device that was configured with an iothread)
cannot allow switching to a different AioContext.

Introduce a set of functions that allow to first check whether all
involved nodes can switch to a new context and only then do the actual
switch. The check recursively covers children and parents.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-20 17:08:56 +02:00
Peter Maydell
01807c8b0e Miscellaneous patches for 2019-05-13
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJc2RbiAAoJEDhwtADrkYZTfpAP/itXg+X9wRfMeerni6SAkgtz
 knaLtJNC7YvwH7W6TIkSf2QgMrg/nYzIXxxj6V+Y3Vdn4CD93C7ldItWkm33amlA
 SEhREHpAn4F3wg/LsOGYYgpuqF/wrUcZsmzudnslfMd3mM6Q9Q6J3q6mu8n1oRcl
 RRKlk++ElqyRdvmxFhddhPxk797Vuunh76vd3ARUFmzKs2n7CGkeBu+qbk41VqI9
 YtjmWHO6BDY5b01PvjuWPir6n1yJsYcpfo3ZElZvPf5jQHMmO6fGB3SZc/PIWegq
 gAVeoXtwhNm+nywMpIv1wHQMkvRDZW0wrurIQBc4VGpH1Pa90dR9FNVZ8r0OZqPB
 aErPCdC7ED73uzJwzXKTnLxY0XDgdhsAsW7lFggANs6YyewZNcbDaVhZWsopTTK/
 3jBbddIw2RsfHNQgXlFVVzjZJGHBNHxFjAFASCKcapUWQwDKU42kQrS1GqxG56NI
 Lgi8Ce+Q0GsVF4wme96Oa/8EMRfmNvsHMfWQvmqGqA1OACSOf2PSGCeD618A5gq6
 kV6wF4v5HdGFkc0x9Vr5ur7kv3eOhpzFzBM6XJXe3CyqnYrkNuBldkyGZBbrNY7G
 aW5sR26Is4m9i+7159cNB5LmnfQqtsscibkSC0UQiXcuWgevd6cdiF+0r1YuNp7C
 Faa2yPOHs4mHCjUwade9
 =yKfZ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2019-05-13' into staging

Miscellaneous patches for 2019-05-13

# gpg: Signature made Mon 13 May 2019 08:04:02 BST
# gpg:                using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2019-05-13:
  Clean up decorations and whitespace around header guards
  Normalize header guard symbol definition.
  Clean up ill-advised or unusual header guards
  Clean up header guards that don't match their file name
  target/xtensa: Clean up core-isa.h header guards
  linux-user/nios2 linux-user/riscv: Clean up header guards
  authz: Normalize #include "authz/trace.h" to "trace.h"
  Use #include "..." for our own headers, <...> for others
  Clean up includes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-05-13 13:55:13 +01:00
Markus Armbruster
6834c3f410 Clean up decorations and whitespace around header guards
Cleaned up with scripts/clean-header-guards.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190315145123.28030-9-armbru@redhat.com>
2019-05-13 08:58:55 +02:00
Alberto Garcia
2e11d7562a block: Remove bdrv_read() and bdrv_write()
No one is using these functions anymore, all callers have switched to
the byte-based bdrv_pread() and bdrv_pwrite()

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-05-10 16:45:40 +02:00
Max Reitz
c4e2aff81b qemu-nbd: Look up flag names in array
The existing code to convert flag bits into strings looks a bit strange
now, and if we ever add more flags, it will look even stranger.  Prevent
that from happening by making it look up the flag names in an array.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20190405191635.25740-1-mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-05-07 09:43:42 -05:00
Vladimir Sementsov-Ogievskiy
371420e217 block: introduce byte-based io helpers
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-04-30 15:29:00 +02:00
Markus Armbruster
e1ce7d747b block/qapi: Clean up how we print to monitor or stdout
bdrv_snapshot_dump(), bdrv_image_info_specific_dump(),
bdrv_image_info_dump() and their helpers take an fprintf()-like
callback and a FILE * to pass to it.

hmp.c passes monitor_printf() cast to fprintf_function and the current
monitor cast to FILE *.

qemu-img.c and qemu-io-cmds.c pass fprintf and stdout.

The type-punning is technically undefined behaviour, but works in
practice.  Clean up: drop the callback, and call qemu_printf()
instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-8-armbru@redhat.com>
2019-04-18 22:18:59 +02:00
Markus Armbruster
ac7ff4cf5f qsp: Simplify how qsp_report() prints
qsp_report() takes an fprintf()-like callback and a FILE * to pass to
it.

Its only caller hmp_sync_profile() passes monitor_fprintf() and the
current monitor cast to FILE *.  monitor_fprintf() casts it right
back, and is otherwise identical to monitor_printf().  The
type-punning is ugly.

Drop the callback, and call qemu_printf() instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20190417191805.28198-7-armbru@redhat.com>
2019-04-18 22:18:59 +02:00
Kevin Wolf
738301e117 file-posix: Support BDRV_REQ_NO_FALLBACK for zero writes
We know that the kernel implements a slow fallback code path for
BLKZEROOUT, so if BDRV_REQ_NO_FALLBACK is given, we shouldn't call it.
The other operations we call in the context of .bdrv_co_pwrite_zeroes
should usually be quick, so no modification should be needed for them.
If we ever notice that there are additional problematic cases, we can
still make these conditional as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26 11:37:51 +01:00
Kevin Wolf
fe0480d629 block: Add BDRV_REQ_NO_FALLBACK
For qemu-img convert, we want an operation that zeroes out the whole
image if this can be done efficiently, but that returns an error
otherwise so we don't write explicit zeroes and immediately overwrite
them with the real data, potentially doubling the amount of data to be
written.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Eric Blake <eblake@redhat.com>
2019-03-26 11:37:51 +01:00
Peter Maydell
523a2a42c3 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+ber27ys35W+dsvQfe+BBqr8OQ4FAlyIFSwACgkQfe+BBqr8
 OQ7wghAAm16eCEr57oTO7QXR3y8uVFsKqXBn9cNH6nbrFp2PUQSglwMDKBls1Z5m
 olF23X/JaqSlSmkL9BBuzDZ6Up+kkHKuxPq4/5RKXfiDI0pr3R0eqts0COAlaN9q
 Bew3ipj99m8gzMi2093AW4+Ob0N3658fuDTGLe1M1Uoy7CEg1QJ7rVOBBEui7vIl
 RbZ8l/Zmb4ldNpB3lnE4Nh9ue8fy0RAj3Nai161nCnNeXNF/VzD3Ye8bojSBbnux
 PIMX6/RWmykX4feIf9QP8apDpxX4HkyuPq5EdwT9PD8PwdyXPAXZtsYUNCuNtQuk
 n5VKFVgFYgqUclBeVHmrMYPU4K4iCFQp4/Fua7wzPEC0iG05NiiDv91oVkEJCp3L
 ManHeuGfNLCcXaIntKZhuJl1cK8yMM3yDww6/pPTehrPjcyvKa0NOqhQBExektcD
 R6q7maJRzFaxSxdcs+Zzuog9zESvH1mlJxQCKzeYhAP0kkxInyTELE/Vbx37xuqR
 RFfZYyVQ6x87Q/sxHx4EMiV97WUM8elZOQdSEC/okt5WUUNpgIu0WF9nSQ1VKZ8C
 CZmv5xh9ogfwvB/kOm6IVwNkLvVagJQcLwddORI5LLXLbSIUcuwVSuyMp/7iDtQ/
 hnHkGs2mIJ2JUYbSSNsSJNs6oTurn8eTFCeGoYKJgd9l4QxaThU=
 =ekU+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jnsnow/tags/bitmaps-pull-request' into staging

Pull request

# gpg: Signature made Tue 12 Mar 2019 20:23:08 GMT
# gpg:                using RSA key F9B7ABDBBCACDF95BE76CBD07DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>" [full]
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/bitmaps-pull-request: (22 commits)
  tests/qemu-iotests: add bitmap resize test 246
  block/qcow2-bitmap: Allow resizes with persistent bitmaps
  block/qcow2-bitmap: Don't check size for IN_USE bitmap
  docs/interop/qcow2: Improve bitmap flag in_use specification
  bitmaps: Fix typo in function name
  block/dirty-bitmaps: implement inconsistent bit
  block/dirty-bitmaps: disallow busy bitmaps as merge source
  block/dirty-bitmaps: prohibit removing readonly bitmaps
  block/dirty-bitmaps: prohibit readonly bitmaps for backups
  block/dirty-bitmaps: add block_dirty_bitmap_check function
  block/dirty-bitmap: add inconsistent status
  block/dirty-bitmaps: add inconsistent bit
  iotests: add busy/recording bit test to 124
  blockdev: remove unused paio parameter documentation
  block/dirty-bitmaps: move comment block
  block/dirty-bitmaps: unify qmp_locked and user_locked calls
  block/dirty-bitmap: explicitly lock bitmaps with successors
  nbd: change error checking order for bitmaps
  block/dirty-bitmap: change semantics of enabled predicate
  block/dirty-bitmap: remove set/reset assertions against enabled bit
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	tests/qemu-iotests/group
2019-03-13 17:30:34 +00:00
Alberto Garcia
5019aece2a block: Remove the AioContext parameter from bdrv_reopen_multiple()
This parameter has been unused since 1a63a90750

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia
8a2ce0bc1e block: Add a 'mutable_opts' field to BlockDriver
If we reopen a BlockDriverState and there is an option that is present
in bs->options but missing from the new set of options then we have to
return an error unless the driver is able to reset it to its default
value.

This patch adds a new 'mutable_opts' field to BlockDriver. This is
a list of runtime options that can be modified during reopen. If an
option in this list is unspecified on reopen then it must be reset (or
return an error).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia
cb828c31de block: Allow changing the backing file on reopen
This patch allows the user to change the backing file of an image that
is being reopened. Here's what it does:

 - In bdrv_reopen_prepare(): check that the value of 'backing' points
   to an existing node or is null. If it points to an existing node it
   also needs to make sure that replacing the backing file will not
   create a cycle in the node graph (i.e. you cannot reach the parent
   from the new backing file).

 - In bdrv_reopen_commit(): perform the actual node replacement by
   calling bdrv_set_backing_hd().

There may be temporary implicit nodes between a BDS and its backing
file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
looks for the real (non-implicit) backing file and requires that the
'backing' option points to it. Replacing or detaching a backing file
is forbidden if there are implicit nodes in the middle.

Although x-blockdev-reopen is meant to be used like blockdev-add,
there's an important thing that must be taken into account: the only
way to set a new backing file is by using a reference to an existing
node (previously added with e.g. blockdev-add).  If 'backing' contains
a dictionary with a new set of options ({"driver": "qcow2", "file": {
... }}) then it is interpreted that the _existing_ backing file must
be reopened with those options.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia
8546632e61 block: Handle child references in bdrv_reopen_queue()
Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
which can contain a set of child options, a child reference, or
NULL. In optional attributes like "backing" it can also be missing.

Only the first case (set of child options) is being handled properly
by bdrv_reopen_queue(). This patch deals with all the others.

Here's how these cases should be handled when bdrv_reopen_queue() is
deciding what to do with each child of a BlockDriverState:

   1) Set of child options: if the child was implicitly created (i.e
      inherits_from points to the parent) then the options are removed
      from the parent's options QDict and are passed to the child with
      a recursive bdrv_reopen_queue() call. This case was already
      working fine.

   2) Child reference: there's two possibilites here.

      2a) Reference to the current child: if the child was implicitly
          created then it is put in the reopen queue, keeping its
          current set of options (since this was a child reference
          there was no way to specify a different set of options).
          If the child is not implicit then it keeps its current set
          of options but it is not reopened (and therefore does not
          inherit any new option from the parent).

      2b) Reference to a different BDS: the current child is not put
          in the reopen queue at all. Passing a reference to a
          different BDS can be used to replace a child, although at
          the moment no driver implements this, so it results in an
          error. In any case, the current child is not going to be
          reopened (and might in fact disappear if it's replaced)

   3) NULL: This is similar to (2b). Although no driver allows this
      yet it can be used to detach the current child so it should not
      be put in the reopen queue.

   4) Missing option: at the moment "backing" is the only case where
      this can happen. With "blockdev-add", leaving "backing" out
      means that the default backing file is opened. We don't want to
      open a new image during reopen, so we require that "backing" is
      always present. We'll relax this requirement a bit in the next
      patch. If keep_old_opts is true and "backing" is missing then
      this behaves like 2a (the current child is reopened).

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia
077e8e2018 block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
The bdrv_reopen_queue() function is used to create a queue with
the BDSs that are going to be reopened and their new options. Once
the queue is ready bdrv_reopen_multiple() is called to perform the
operation.

The original options from each one of the BDSs are kept, with the new
options passed to bdrv_reopen_queue() applied on top of them.

For "x-blockdev-reopen" we want a function that behaves much like
"blockdev-add". We want to ignore the previous set of options so that
only the ones actually specified by the user are applied, with the
rest having their default values.

One of the things that we need is a way to tell bdrv_reopen_queue()
whether we want to keep the old set of options or not, and that's what
this patch does. All current callers are setting this new parameter to
true and x-blockdev-reopen will set it to false.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Alberto Garcia
2cad1ebe70 block: Allow freezing BdrvChild links
Our permission system is useful to define what operations are allowed
on a certain block node and includes things like BLK_PERM_WRITE or
BLK_PERM_RESIZE among others.

One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
the node that this BdrvChild points to". The exact meaning of this has
never been very clear, but it can be understood as "change any of the
links connected to the node". This can be used to prevent changing a
backing link, but it's too coarse.

This patch adds a new 'frozen' attribute to BdrvChild, which forbids
detaching the link from the node it points to, and new API to freeze
and unfreeze a backing chain.

After this change a few functions can fail, so they need additional
checks.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-12 20:30:14 +01:00
Eric Blake
796a3798ab bitmaps: Fix typo in function name
Commit a88b179f introduced the ability to set and query bitmap
persistence, but with an atypical spelling.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20190308205845.25734-1-eblake@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2019-03-12 12:05:49 -04:00
John Snow
3ae96d6684 block/dirty-bitmaps: add block_dirty_bitmap_check function
Instead of checking against busy, inconsistent, or read only directly,
use a check function with permissions bits that let us streamline the
checks without reproducing them in many places.

Included in this patch are permissions changes that simply add the
inconsistent check to existing permissions call spots, without
addressing existing bugs.

In general, this means that busy+readonly checks become BDRV_BITMAP_DEFAULT,
which checks against all three conditions. busy-only checks become
BDRV_BITMAP_ALLOW_RO.

Notably, remove allows inconsistent bitmaps, so it doesn't follow the pattern.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190301191545.8728-4-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2019-03-12 12:05:49 -04:00
John Snow
b0f455599d block/dirty-bitmaps: add inconsistent bit
Add an inconsistent bit to dirty-bitmaps that allows us to report a bitmap as
persistent but potentially inconsistent, i.e. if we find bitmaps on a qcow2
that have been marked as "in use".

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190301191545.8728-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2019-03-12 12:05:49 -04:00
John Snow
27a1b301a4 block/dirty-bitmaps: unify qmp_locked and user_locked calls
These mean the same thing now. Unify them and rename the merged call
bdrv_dirty_bitmap_busy to indicate semantically what we are describing,
as well as help disambiguate from the various _locked and _unlocked
versions of bitmap helpers that refer to mutex locks.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190223000614.13894-8-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2019-03-12 12:05:48 -04:00
John Snow
50a47257f8 block/dirty-bitmaps: rename frozen predicate helper
"Frozen" was a good description a long time ago, but it isn't adequate now.
Rename the frozen predicate to has_successor to make the semantics of the
predicate more clear to outside callers.

In the process, remove some calls to frozen() that no longer semantically
make sense. For bdrv_enable_dirty_bitmap_locked and
bdrv_disable_dirty_bitmap_locked, it doesn't make sense to prohibit QEMU
internals from performing this action when we only wished to prohibit QMP
users from issuing these commands. All of the QMP API commands for bitmap
manipulation already check against user_locked() to prohibit these actions.

Several other assertions really want to check that the bitmap isn't in-use
by another operation -- use the bitmap_user_locked function for this instead,
which presently also checks for has_successor. This leaves some redundant
checks of has_successor through different helpers that are addressed in
forthcoming patches.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20190223000614.13894-3-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2019-03-12 12:05:48 -04:00
Peter Maydell
e2a18635a4 nbd patches for 2019-03-08
- support TLS client authorization in NBD servers
 - iotest 223 race fix
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABCAAGBQJcgqh3AAoJEKeha0olJ0Nq/bgH/1TXo49gC9SMNcBBHd5vqc6/
 J+eXYQihmGLy7pNkfCBTB0QZz9d7V4tN/N1PAuIfzsHxcQJeyUBwcY7jin2SiTTM
 lfR9NWDY43OE+8tcPSXODyo3mge8g3d1X3vw8/QMX95TDrKQ8SMwAllegCFBKPZs
 T0+Jyfd8oA0NcQz4EPPUL5f2ptLo2slye2ZjbMBn/1WFrYkL+joUYJgyakYcZnY/
 mcvmXF2JLG2fPzFoU1yvF+oZn6J2fx5pw92P+SZ7lA+qRzlWfvrVyK9sNqCS+K5m
 qdfMeeL/SyUPsUvlcbDH7iSjxWkR/h7MtXRq83FHzupasMeXiQ9ieb3MFAtHnGM=
 =5pyZ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2019-03-08' into staging

nbd patches for 2019-03-08

- support TLS client authorization in NBD servers
- iotest 223 race fix

# gpg: Signature made Fri 08 Mar 2019 17:37:59 GMT
# gpg:                using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg:                 aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2019-03-08:
  iotests: Wait for qemu to end in 223
  nbd: fix outdated qapi docs syntax for tls-creds
  nbd: allow authorization with nbd-server-start QMP command
  qemu-nbd: add support for authorization of TLS clients

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-03-09 20:55:44 +00:00
Kevin Wolf
6c3944dc62 qcow2: Implement data-file-raw create option
Provide an option to force QEMU to always keep the external data file
consistent as a standalone read-only raw image.

At the moment, this means making sure that write_zeroes requests are
forwarded to the data file instead of just updating the metadata, and
checking that no backing file is used.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-08 12:26:46 +01:00
Kevin Wolf
dcc98687f8 qcow2: Creating images with external data file
This adds a .bdrv_create option to use an external data file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-08 12:26:46 +01:00
Andrey Shinkevich
9ac404c523 block: iterate_format with account of whitelisting
bdrv_iterate_format (which is currently only used for printing out the
formats supported by the block layer) doesn't take format whitelisting
into account.

This creates a problem for tests: they enumerate supported formats to
decide which tests to enable, but then discover that QEMU doesn't let
them actually use some of those formats.

To avoid that, exclude formats that are not whitelisted from
enumeration, if whitelisting is in use.  Since we have separate
whitelists for r/w and r/o, take this a parameter to
bdrv_iterate_format, and print two lists of supported formats (r/w and
r/o) in main qemu.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-03-08 12:26:45 +01:00
Daniel P. Berrange
000194556b nbd: allow authorization with nbd-server-start QMP command
As with the previous patch to qemu-nbd, the nbd-server-start QMP command
also needs to be able to specify authorization when enabling TLS encryption.

First the client must create a QAuthZ object instance using the
'object-add' command:

   {
     'execute': 'object-add',
     'arguments': {
       'qom-type': 'authz-list',
       'id': 'authz0',
       'parameters': {
         'policy': 'deny',
         'rules': [
           {
             'match': '*CN=fred',
             'policy': 'allow'
           }
         ]
       }
     }
   }

They can then reference this in the new 'tls-authz' parameter when
executing the 'nbd-server-start' command:

   {
     'execute': 'nbd-server-start',
     'arguments': {
       'addr': {
           'type': 'inet',
           'host': '127.0.0.1',
           'port': '9000'
       },
       'tls-creds': 'tls0',
       'tls-authz': 'authz0'
     }
   }

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20190227162035.18543-3-berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-03-06 11:05:27 -06:00
Daniel P. Berrange
b25e12daff qemu-nbd: add support for authorization of TLS clients
Currently any client which can complete the TLS handshake is able to use
the NBD server. The server admin can turn on the 'verify-peer' option
for the x509 creds to require the client to provide a x509 certificate.
This means the client will have to acquire a certificate from the CA
before they are permitted to use the NBD server. This is still a fairly
low bar to cross.

This adds a '--tls-authz OBJECT-ID' option to the qemu-nbd command which
takes the ID of a previously added 'QAuthZ' object instance. This will
be used to validate the client's x509 distinguished name. Clients
failing the authorization check will not be permitted to use the NBD
server.

For example to setup authorization that only allows connection from a client
whose x509 certificate distinguished name is

   CN=laptop.example.com,O=Example Org,L=London,ST=London,C=GB

escape the commas in the name and use:

  qemu-nbd --object tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\
                    endpoint=server,verify-peer=yes \
           --object 'authz-simple,id=auth0,identity=CN=laptop.example.com,,\
                     O=Example Org,,L=London,,ST=London,,C=GB' \
           --tls-creds tls0 \
           --tls-authz authz0 \
	   ....other qemu-nbd args...

NB: a real shell command line would not have leading whitespace after
the line continuation, it is just included here for clarity.

Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20190227162035.18543-2-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: split long line in --help text, tweak 233 to show that whitespace
after ,, in identity= portion is actually okay]
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-03-06 11:05:27 -06:00
Max Reitz
998b3a1e5a block: Purify .bdrv_refresh_filename()
Currently, BlockDriver.bdrv_refresh_filename() is supposed to both
refresh the filename (BDS.exact_filename) and set BDS.full_open_options.
Now that we have generic code in the central bdrv_refresh_filename() for
creating BDS.full_open_options, we can drop the latter part from all
BlockDriver.bdrv_refresh_filename() implementations.

This also means that we can drop all of the existing default code for
this from the global bdrv_refresh_filename() itself.

Furthermore, we now have to call BlockDriver.bdrv_refresh_filename()
after having set BDS.full_open_options, because the block driver's
implementation should now be allowed to depend on BDS.full_open_options
being set correctly.

Finally, with this patch we can drop the @options parameter from
BlockDriver.bdrv_refresh_filename(); also, add a comment on this
function's purpose in block/block_int.h while touching its interface.

This completely obsoletes blklogwrite's implementation of
.bdrv_refresh_filename().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-25-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:27 +01:00
Max Reitz
abc521a9aa block: Add BlockDriver.bdrv_gather_child_options
Some follow-up patches will rework the way bs->full_open_options is
refreshed in bdrv_refresh_filename(). The new implementation will remove
the need for the block drivers' bdrv_refresh_filename() implementations
to set bs->full_open_options; instead, it will be generic and use static
information from each block driver.

However, by implementing bdrv_gather_child_options(), block drivers will
still be able to override the way the full_open_options of their
children are incorporated into their own.

We need to implement this function for VMDK because we have to prevent
the generic implementation from gathering the options of all children:
It is not possible to specify options for the extents through the
runtime options.

For quorum, the child names that would be used by the generic
implementation and the ones that we actually (currently) want to use
differ. See quorum_gather_child_options() for more information.

Note that both of these are cases which are not ideal: In case of VMDK
it would probably be nice to be able to specify options for all extents.
In case of quorum, the current runtime option structure is simply broken
and needs to be fixed (but that is left for another patch).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-23-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:27 +01:00
Max Reitz
2654267cc1 block: Add strong_runtime_opts to BlockDriver
This new field can be set by block drivers to list the runtime options
they accept that may influence the contents of the respective BDS. As of
a follow-up patch, this list will be used by the common
bdrv_refresh_filename() implementation to decide which options to put
into BDS.full_open_options (and consequently whether a JSON filename has
to be created), thus freeing the drivers of having to implement that
logic themselves.

Additionally, this patch adds the field to all of the block drivers that
need it and sets it accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-22-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:27 +01:00
Max Reitz
1e89d0f9be block: Add bdrv_dirname()
This function may be implemented by block drivers to derive a directory
name from a BDS. Concatenating this g_free()-able string with a relative
filename must result in a valid (not necessarily existing) filename, so
this is a function that should generally be not implemented by format
drivers, because this is protocol-specific.

If a BDS's driver does not implement this function, bdrv_dirname() will
fall through to the BDS's file if it exists. If it does not, the
exact_filename field will be used to generate a directory name.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-15-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:26 +01:00
Max Reitz
6b6833c1b4 block: bdrv_get_full_backing_filename's ret. val.
Make bdrv_get_full_backing_filename() return an allocated string instead
of placing the result in a caller-provided buffer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-12-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:26 +01:00
Max Reitz
645ae7d88e block: bdrv_get_full_backing_filename_from_...'s ret. val.
Make bdrv_get_full_backing_filename_from_filename() return an allocated
string instead of placing the result in a caller-provided buffer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:26 +01:00
Max Reitz
009b03aaa2 block: Make path_combine() return the path
Besides being safe for arbitrary path lengths, after some follow-up
patches all callers will want a freshly allocated buffer anyway.

In the meantime, path_combine_deprecated() is added which has the same
interface as path_combine() had before this patch. All callers to that
function will be converted in follow-up patches.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20190201192935.18394-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:26 +01:00
Max Reitz
998c201923 block: Add BDS.auto_backing_file
If the backing file is overridden, this most probably does change the
guest-visible data of a BDS.  Therefore, we will need to consider this
in bdrv_refresh_filename().

To see whether it has been overridden, we might want to compare
bs->backing_file and bs->backing->bs->filename.  However,
bs->backing_file is changed by bdrv_set_backing_hd() (which is just used
to change the backing child at runtime, without modifying the image
header), so bs->backing_file most of the time simply contains a copy of
bs->backing->bs->filename anyway, so it is useless for such a
comparison.

This patch adds an auto_backing_file BDS field which contains the
backing file path as indicated by the image header, which is not changed
by bdrv_set_backing_hd().

Because of bdrv_refresh_filename() magic, however, a BDS's filename may
differ from what has been specified during bdrv_open().  Then, the
comparison between bs->auto_backing_file and bs->backing->bs->filename
may fail even though bs->backing was opened from bs->auto_backing_file.
To mitigate this, we can copy the real BDS's filename (after the whole
bdrv_open() and bdrv_refresh_filename() process) into
bs->auto_backing_file, if we know the former has been opened based on
the latter.  This is only possible if no options modifying the backing
file's behavior have been specified, though.  To simplify things, this
patch only copies the filename from the backing file if no options have
been specified for it at all.

Furthermore, there are cases where an overlay is created by qemu which
already contains a BDS's filename (e.g. in blockdev-snapshot-sync).  We
do not need to worry about updating the overlay's bs->auto_backing_file
there, because we actually wrote a post-bdrv_refresh_filename() filename
into the image header.

So all in all, there will be false negatives where (as of a future
patch) bdrv_refresh_filename() will assume that the backing file differs
from what was specified in the image header, even though it really does
not.  However, these cases should be limited to where (1) the user
actually did override something in the backing chain (e.g. by specifying
options for the backing file), or (2) the user executed a QMP command to
change some node's backing file (e.g. change-backing-file or
block-commit with @backing-file given) where the given filename does not
happen to coincide with qemu's idea of the backing BDS's filename.

Then again, (1) really is limited to -drive.  With -blockdev or
blockdev-add, you have to adhere to the schema, so a user cannot give
partial "unimportant" options (e.g. by just setting backing.node-name
and leaving the rest to the image header).  Therefore, trying to fix
this would mean trying to fix something for -drive only.

To improve on (2), we would need a full infrastructure to "canonicalize"
an arbitrary filename (+ options), so it can be compared against
another.  That seems a bit over the top, considering that filenames
nowadays are there mostly for the user's entertainment.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20190201192935.18394-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:25 +01:00
Max Reitz
f30c66ba6e block: Use bdrv_refresh_filename() to pull
Before this patch, bdrv_refresh_filename() is used in a pushing manner:
Whenever the BDS graph is modified, the parents of the modified edges
are supposed to be updated (recursively upwards).  However, that is
nonviable, considering that we want child changes not to concern
parents.

Also, in the long run we want a pull model anyway: Here, we would have a
bdrv_filename() function which returns a BDS's filename, freshly
constructed.

This patch is an intermediate step.  It adds bdrv_refresh_filename()
calls before every place a BDS.filename value is used.  The only
exceptions are protocol drivers that use their own filename, which
clearly would not profit from refreshing that filename before.

Also, bdrv_get_encrypted_filename() is removed along the way (as a user
of BDS.filename), since it is completely unused.

In turn, all of the calls to bdrv_refresh_filename() before this patch
are removed, because we no longer have to call this function on graph
changes.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20190201192935.18394-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2019-02-25 15:11:25 +01:00
Vladimir Sementsov-Ogievskiy
f962e96150 block: fix bdrv_check_perm for non-tree subgraph
bdrv_check_perm in it's recursion checks each node in context of new
permissions for one parent, because of nature of DFS. It works well,
while children subgraph of top-most updated node is a tree, i.e. it
doesn't have any kind of loops. But if we have a loop (not oriented,
of course), i.e. we have two different ways from top-node to some
child-node, then bdrv_check_perm will do wrong thing:

  top
  | \
  |  |
  v  v
  A  B
  |  |
  v  v
  node

It will once check new permissions of node in context of new A
permissions and old B permissions and once visa-versa. It's a wrong way
and may lead to corruption of permission system. We may start with
no-permissions and all-shared for both A->node and B->node relations
and finish up with non shared write permission for both ways.

The following commit will add a test, which shows this bug.

To fix this situation, let's really set BdrvChild permissions during
bdrv_check_perm procedure. And we are happy here, as check-perm is
already written in transaction manner, so we just need to restore
backed-up permissions in _abort.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-02-25 15:03:19 +01:00
Kevin Wolf
d3bd5b9089 nbd: Use low-level QIOChannel API in nbd_read_eof()
Instead of using the convenience wrapper qio_channel_read_all_eof(), use
the lower level QIOChannel API. This means duplicating some code, but
we'll need this because this coroutine yield is special: We want it to
be interruptible so that nbd_client_attach_aio_context() can correctly
reenter the coroutine.

This moves the bdrv_dec/inc_in_flight() pair into nbd_read_eof(), so
that connection_co will always sit in this exact qio_channel_yield()
call when bdrv_drain() returns.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-02-25 15:03:19 +01:00
Kevin Wolf
a7b78fc944 nbd: Move nbd_read_eof() to nbd/client.c
The only caller of nbd_read_eof() is nbd_receive_reply(), so it doesn't
have to live in the header file, but can move next to its caller.

Also add the missing coroutine_fn to the function and its caller.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2019-02-25 15:03:19 +01:00
Daniel Henrique Barboza
8c04093c8c block/snapshot: remove bdrv_snapshot_delete_by_id_or_name
After the previous patch, the only instance of this function left
is inside qemu-img.c.

qemu-img is using it inside the 'img_snapshot' function to delete
snapshots in the SNAPSHOT_DELETE case, based on a "snapshot_name"
string that refers to the tag, not ID, of the QEMUSnapshotInfo struct.
This can be verified by checking the SNAPSHOT_CREATE case that
comes shortly before SNAPSHOT_DELETE. In that case, the same
"snapshot_name" variable is being strcpy to the 'name' field
of the QEMUSnapshotInfo struct sn:

pstrcpy(sn.name, sizeof(sn.name), snapshot_name);

Based on that, it is unlikely that "snapshot_name" might contain
an "id" in SNAPSHOT_DELETE.

This patch changes SNAPSHOT_DELETE to use snapshot_find() and
snapshot_delete() instead of bdrv_snapshot_delete_by_id_or_name.
After that, there is no instances left of bdrv_snapshot_delete_by_id_or_name
in the code, so it is safe to remove it entirely.

Suggested-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2019-02-25 15:03:18 +01:00
Andrey Shinkevich
1bf6e9ca92 bdrv_query_image_info Error parameter added
Inform a user in case qcow2_get_specific_info fails to obtain
QCOW2 image specific information. This patch is preliminary to
the one "qcow2: Add list of bitmaps to ImageInfoSpecificQCow2".

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1549638368-530182-2-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2019-02-11 14:35:43 -06:00