Commit Graph

3364 Commits

Author SHA1 Message Date
Eric Blake 004a89fce9 nbd: Create struct for tracking export info
The NBD Protocol is introducing some additional information
about exports, such as minimum request size and alignment, as
well as an advertised maximum request size.  It will be easier
to feed this information back to the block layer if we gather
all the information into a struct, rather than adding yet more
pointer parameters during negotiation.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707203049.534-2-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-14 12:04:41 +02:00
Peter Maydell a309b290aa Error reporting patches for 2017-07-13
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZZ1/BAAoJEDhwtADrkYZTo7oP+gLj4B4kkp/DJnkzfuMMD1Ce
 ZPddZ8Z9RyXE4fS66sq1ODBQo5U+aQQZO7K234+jf8V4cKWW98lpVzLc3YdAHm2U
 ZF6Z9Rji5K4414ZsUcg92Zlovvdaji+mY0ooINav+4mqlONYrz29ntApWc0e0tGc
 e3tj4XDLhJrOM+mIx8vzixFlgSYj+6HgEiybYwolEK5svQbIQao3Y2omyb+zy0w0
 RDT3XQnAAaZSOQAXcJGkhekkyMe0jMHOF0tULLx1uDQYctg9mUGlAGTZ5oTLgSve
 TCpSJwWCAx8XAJMkXyDRrdRFDLeUh6yGY7NTqAL3OuPSoAw9ygKrHyhTavxBJL+W
 rX7Qit3dmVrlZLviwNFQplAKYb10d08vBoKXmrnW5oVCmPEDvJIQfncbucpA/CNS
 ucdJ3RMLuDbbWdl+5tsL7jfiZAG7oSgAePTjN1rm0bDe5JN7NAU8WzHnKfE83iZq
 R+I3hofqGoiXSByYRLamZb+6nsURAxWPhcqcw7hdMsk7UI6dyZwWl9Fnm72w0BZK
 M5LHLkX0LYc+kZjiLKXlNK7Z50bXY0zKQpPCLH3nHA69iMiwVoozrjwa9iCKIxE+
 7ZlOfsu4ztExuicEyTr8b27CBrHjJjYDuFP0hroEOzqCKXUzegoq3oYMGP0doXxe
 o3xcwXVKT/1PudddyR4z
 =tByN
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-07-13' into staging

Error reporting patches for 2017-07-13

# gpg: Signature made Thu 13 Jul 2017 12:55:45 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2017-07-13:
  Convert error_report*_err() to warn_report*_err()
  error: Implement the warn and free Error functions
  char-socket: Report TCP socket waiting as information
  Convert error_report() to warn_report()
  error: Functions to report warnings and informational messages
  util/qemu-error: Rename error_print_loc() to be more generic
  websock: Don't try to set *errp directly
  block: Don't try to set *errp directly
  xilinx: Fix latent error handling bug

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-14 09:36:40 +01:00
Alistair Francis 3dc6f86936 Convert error_report() to warn_report()
Convert all uses of error_report("warning:"... to use warn_report()
instead. This helps standardise on a single method of printing warnings
to the user.

All of the warnings were changed using these two commands:
    find ./* -type f -exec sed -i \
      's|error_report(".*warning[,:] |warn_report("|Ig' {} +

Indentation fixed up manually afterwards.

The test-qdev-global-props test case was manually updated to ensure that
this patch passes make check (as the test cases are case sensitive).

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Cc: Jeff Cody <jcody@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Lieven <pl@kamp.de>
Cc: Josh Durgin <jdurgin@redhat.com>
Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Greg Kurz <groug@kaod.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Peter Chubb <peter.chubb@nicta.com.au>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Greg Kurz <groug@kaod.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed by: Peter Chubb <peter.chubb@data61.csiro.au>
Acked-by: Max Reitz <mreitz@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Message-Id: <e1cfa2cd47087c248dd24caca9c33d9af0c499b0.1499866456.git.alistair.francis@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-07-13 13:49:58 +02:00
Max Reitz 772d1f973f block/qcow2: falloc/full preallocating growth
Implement the preallocation modes falloc and full for growing qcow2
images.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613202107.10125-15-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz 60c48a29b7 block/qcow2: Rename "fail_block" to just "fail"
Now alloc_refcount_block() only contains a single fail label, so it
makes more sense to just name it "fail" instead of "fail_block".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-14-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz 12cc30a8cb block/qcow2: Add qcow2_refcount_area()
This function creates a collection of self-describing refcount
structures (including a new refcount table) at the end of a qcow2 image
file. Optionally, these structures can also describe a number of
additional clusters beyond themselves; this will be important for
preallocated truncation, which will place the data clusters and L2
tables there.

For now, we can use this function to replace the part of
alloc_refcount_block() that grows the refcount table (from which it is
actually derived).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-13-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz 95b98f343b block/qcow2: Metadata preallocation for truncate
We can support PREALLOC_MODE_METADATA by invoking preallocate() in
qcow2_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613202107.10125-12-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz 652fecd005 block/qcow2: Lock s->lock in preallocate()
preallocate() is and will be called only from places that do not
otherwise need to lock s->lock: Currently that is qcow2_create2(), as of
a future patch it will be called from qcow2_truncate(), too.

It therefore makes sense to move locking that mutex into preallocate()
itself.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-11-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz 7bc45dc172 block/qcow2: Generalize preallocate()
This patch adds two new parameters to the preallocate() function so we
will be able to use it not just for preallocating a new image but also
for preallocated image growth.

The offset parameter allows the caller to specify a virtual offset from
which to start preallocating. For newly created images this is always 0,
but for preallocating growth this will be the old image length.

The new_length parameter specifies the supposed new length of the image
(basically the "end offset" for preallocation). During image truncation,
bdrv_getlength() will return the old image length so we cannot rely on
its return value then.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-10-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:02 +02:00
Max Reitz 35d72602ec block/file-posix: Preallocation for truncate
By using raw_regular_truncate() in raw_truncate(), we can now easily
support preallocation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-9-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz d0bc9e5d5e block/file-posix: Generalize raw_regular_truncate
Currently, raw_regular_truncate() is intended for setting the size of a
newly created file. However, we also want to use it for truncating an
existing file in which case only the newly added space (when growing)
should be preallocated.

This also means that if resizing failed, we should try to restore the
original file size. This is important when using preallocation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-8-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz 9f63b07ee7 block/file-posix: Extract raw_regular_truncate()
This functionality is part of raw_create() which we will be able to
reuse nicely in raw_truncate().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170613202107.10125-7-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz 7dacd8bd3d block/file-posix: Small fixes in raw_create()
Variables should be declared at the start of a block, and if a certain
parameter value is not supported it may be better to return -ENOTSUP
instead of -EINVAL.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170613202107.10125-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz 3a691c50f1 block: Add PreallocMode to blk_truncate()
blk_truncate() itself will pass that value to bdrv_truncate(), and all
callers of blk_truncate() just set the parameter to PREALLOC_MODE_OFF
for now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz 7ea37c3066 block: Add PreallocMode to bdrv_truncate()
For block drivers that just pass a truncate request to the underlying
protocol, we can now pass the preallocation mode instead of aborting if
it is not PREALLOC_MODE_OFF.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-3-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Max Reitz 8243ccb743 block: Add PreallocMode to BD.bdrv_truncate()
Add a PreallocMode parameter to the bdrv_truncate() function implemented
by each block driver. Currently, we always pass PREALLOC_MODE_OFF and no
driver accepts anything else.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170613202107.10125-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:01 +02:00
Stefan Hajnoczi c501c35220 qcow2: add bdrv_measure() support
Use qcow2_calc_prealloc_size() to get the required file size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-7-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi 0eb4a8c1df qcow2: extract image creation option parsing
The image creation options parsed by qcow2_create() are also needed to
implement .bdrv_measure().  Extract the parsing code, including input
validation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-6-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi 7c5bcc4212 qcow2: make refcount size calculation conservative
The refcount metadata size calculation is inaccurate and can produce
numbers that are too small.  This is bad because we should calculate a
conservative number - one that is guaranteed to be large enough.

This patch switches the approach to a fixed point calculation because
the existing equation is hard to solve when inaccuracies are taken care
of.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-5-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi 95c67e3bd7 qcow2: extract preallocation calculation function
Calculating the preallocated image size will be needed to implement
.bdrv_measure().  Extract the code out into a separate function.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-4-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Stefan Hajnoczi a843a22a82 raw-format: add bdrv_measure() support
Maximum size calculation is trivial for the raw format: it's just the
requested image size (because there is no metadata).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170705125738.8777-3-stefanha@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:45:00 +02:00
Vladimir Sementsov-Ogievskiy 615b5dcf2d block: release persistent bitmaps on inactivate
We should release them here to reload on invalidate cache.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-31-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy 469c71edc7 qcow2: add .bdrv_remove_persistent_dirty_bitmap
Realize .bdrv_remove_persistent_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-29-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy 56f364e6d7 block/dirty-bitmap: add bdrv_remove_persistent_dirty_bitmap
Interface for removing persistent bitmap from its storage.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-28-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy a3b52535e8 qmp: add x-debug-block-dirty-bitmap-sha256
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-26-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy da0eb242ad qcow2: add .bdrv_can_store_new_dirty_bitmap
Realize .bdrv_can_store_new_dirty_bitmap interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-23-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:59 +02:00
Vladimir Sementsov-Ogievskiy 169b879359 qcow2: store bitmaps on reopening image as read-only
Store bitmaps and mark them read-only on reopening image as read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-21-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy 5f72826e7f qcow2: add persistent dirty bitmaps support
Store persistent dirty bitmaps in qcow2 image.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-20-vsementsov@virtuozzo.com
[mreitz: Always assign ret in store_bitmap() in case of an error]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy 3dd10a06d1 block/dirty-bitmap: add bdrv_dirty_bitmap_next()
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-19-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy a88b179fdb block: introduce persistent dirty bitmaps
New field BdrvDirtyBitmap.persistent means, that bitmap should be saved
by format driver in .bdrv_close and .bdrv_inactivate. No format driver
supports it for now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-18-vsementsov@virtuozzo.com
[mreitz: Fixed indentation]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy a0319aacd4 block/dirty-bitmap: add autoload field to BdrvDirtyBitmap
Mirror AUTO flag from Qcow2 bitmap in BdrvDirtyBitmap. This will be
needed in future, to save this flag back to Qcow2 for persistent
bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-16-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy 1b6b0562db qcow2: support .bdrv_reopen_bitmaps_rw
Realize bdrv_reopen_bitmaps_rw interface.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-15-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy d1258dd0c8 qcow2: autoloading dirty bitmaps
Auto loading bitmaps are bitmaps in Qcow2, with the AUTO flag set. They
are loaded when the image is opened and become BdrvDirtyBitmaps for the
corresponding drive.

Extra data in bitmaps is not supported for now.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-12-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy d6883bc968 block/dirty-bitmap: add readonly field to BdrvDirtyBitmap
It will be needed in following commits for persistent bitmaps.
If bitmap is loaded from read-only storage (and we can't mark it
"in use" in this storage) corresponding BdrvDirtyBitmap should be
read-only.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20170628120530.31251-11-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy 8bfc932e1e block/dirty-bitmap: fix comment for BlockDirtyBitmap.disabled field
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170628120530.31251-10-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:58 +02:00
Vladimir Sementsov-Ogievskiy 88ddffae8f qcow2: add bitmaps extension
Add bitmap extension as specified in docs/specs/qcow2.txt.
For now, just mirror extension header into Qcow2 state and check
constraints. Also, calculate refcounts for qcow2 bitmaps, to not break
qemu-img check.

For now, disable image resize if it has bitmaps. It will be fixed later.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-9-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy 8a5bb1f114 qcow2-refcount: rename inc_refcounts() and make it public
This is needed for the following patch, which will introduce refcounts
checking for qcow2 bitmaps.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-8-vsementsov@virtuozzo.com
[mreitz: s/inc_refcounts/qcow2_inc_refcounts_imrt/ in one more (new)
         place]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy 6bdc8b719a block/dirty-bitmap: add deserialize_ones func
Add bdrv_dirty_bitmap_deserialize_ones() function, which is needed for
qcow2 bitmap loading, to handle unallocated bitmap parts, marked as
all-ones.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170628120530.31251-7-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:57 +02:00
Vladimir Sementsov-Ogievskiy ba06ff1a5c block: fix bdrv_dirty_bitmap_granularity signature
Make getter signature const-correct. This allows other functions with
const dirty bitmap parameter use bdrv_dirty_bitmap_granularity().

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170628120530.31251-6-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:57 +02:00
Daniel P. Berrange 0a12f6f80e qcow2: report encryption specific image information
Currently 'qemu-img info' reports a simple "encrypted: yes"
field. This is not very useful now that qcow2 can support
multiple encryption formats. Users want to know which format
is in use and some data related to it.

Wire up usage of the qcrypto_block_get_info() method so that
'qemu-img info' can report about the encryption format
and parameters in use

  $ qemu-img create \
      --object secret,id=sec0,data=123456 \
      -o encrypt.format=luks,encrypt.key-secret=sec0 \
      -f qcow2 demo.qcow2 1G
  Formatting 'demo.qcow2', fmt=qcow2 size=1073741824 \
  encryption=off encrypt.format=luks encrypt.key-secret=sec0 \
  cluster_size=65536 lazy_refcounts=off refcount_bits=16

  $ qemu-img info demo.qcow2
  image: demo.qcow2
  file format: qcow2
  virtual size: 1.0G (1073741824 bytes)
  disk size: 480K
  encrypted: yes
  cluster_size: 65536
  Format specific information:
      compat: 1.1
      lazy refcounts: false
      refcount bits: 16
      encrypt:
          ivgen alg: plain64
          hash alg: sha256
          cipher alg: aes-256
          uuid: 3fa930c4-58c8-4ef7-b3c5-314bb5af21f3
          format: luks
          cipher mode: xts
          slots:
              [0]:
                  active: true
                  iters: 1839058
                  key offset: 4096
                  stripes: 4000
              [1]:
                  active: false
                  key offset: 262144
              [2]:
                  active: false
                  key offset: 520192
              [3]:
                  active: false
                  key offset: 778240
              [4]:
                  active: false
                  key offset: 1036288
              [5]:
                  active: false
                  key offset: 1294336
              [6]:
                  active: false
                  key offset: 1552384
              [7]:
                  active: false
                  key offset: 1810432
          payload offset: 2068480
          master key iters: 438487
      corrupt: false

With the legacy "AES" encryption we just report the format
name

  $ qemu-img create \
      --object secret,id=sec0,data=123456 \
      -o encrypt.format=aes,encrypt.key-secret=sec0 \
      -f qcow2 demo.qcow2 1G
  Formatting 'demo.qcow2', fmt=qcow2 size=1073741824 \
  encryption=off encrypt.format=aes encrypt.key-secret=sec0 \
  cluster_size=65536 lazy_refcounts=off refcount_bits=16

  $ ./qemu-img info demo.qcow2
  image: demo.qcow2
  file format: qcow2
  virtual size: 1.0G (1073741824 bytes)
  disk size: 196K
  encrypted: yes
  cluster_size: 65536
  Format specific information:
      compat: 1.1
      lazy refcounts: false
      refcount bits: 16
      encrypt:
          format: aes
      corrupt: false

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-20-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:57 +02:00
Daniel P. Berrange 1cd9a787a2 block: pass option prefix down to crypto layer
While the crypto layer uses a fixed option name "key-secret",
the upper block layer may have a prefix on the options. e.g.
"encrypt.key-secret", in order to avoid clashes between crypto
option names & other block option names. To ensure the crypto
layer can report accurate error messages, we must tell it what
option name prefix was used.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-19-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange c01c214b69 block: remove all encryption handling APIs
Now that all encryption keys must be provided upfront via
the QCryptoSecret API and associated block driver properties
there is no need for any explicit encryption handling APIs
in the block layer. Encryption can be handled transparently
within the block driver. We only retain an API for querying
whether an image is encrypted or not, since that is a
potentially useful piece of metadata to report to the user.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-18-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange 4652b8f3e1 qcow2: add support for LUKS encryption format
This adds support for using LUKS as an encryption format
with the qcow2 file, using the new encrypt.format parameter
to request "luks" format. e.g.

  # qemu-img create --object secret,data=123456,id=sec0 \
       -f qcow2 -o encrypt.format=luks,encrypt.key-secret=sec0 \
       test.qcow2 10G

The legacy "encryption=on" parameter still results in
creation of the old qcow2 AES format (and is equivalent
to the new 'encryption-format=aes'). e.g. the following are
equivalent:

  # qemu-img create --object secret,data=123456,id=sec0 \
       -f qcow2 -o encryption=on,encrypt.key-secret=sec0 \
       test.qcow2 10G

 # qemu-img create --object secret,data=123456,id=sec0 \
       -f qcow2 -o encryption-format=aes,encrypt.key-secret=sec0 \
       test.qcow2 10G

With the LUKS format it is necessary to store the LUKS
partition header and key material in the QCow2 file. This
data can be many MB in size, so cannot go into the QCow2
header region directly. Thus the spec defines a FDE
(Full Disk Encryption) header extension that specifies
the offset of a set of clusters to hold the FDE headers,
as well as the length of that region. The LUKS header is
thus stored in these extra allocated clusters before the
main image payload.

Aside from all the cryptographic differences implied by
use of the LUKS format, there is one further key difference
between the use of legacy AES and LUKS encryption in qcow2.
For LUKS, the initialiazation vectors are generated using
the host physical sector as the input, rather than the
guest virtual sector. This guarantees unique initialization
vectors for all sectors when qcow2 internal snapshots are
used, thus giving stronger protection against watermarking
attacks.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-14-berrange@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange b25b387fa5 qcow2: convert QCow2 to use QCryptoBlock for encryption
This converts the qcow2 driver to make use of the QCryptoBlock
APIs for encrypting image content, using the legacy QCow2 AES
scheme.

With this change it is now required to use the QCryptoSecret
object for providing passwords, instead of the current block
password APIs / interactive prompting.

  $QEMU \
    -object secret,id=sec0,file=/home/berrange/encrypted.pw \
    -drive file=/home/berrange/encrypted.qcow2,encrypt.key-secret=sec0

The test 087 could be simplified since there is no longer a
difference in behaviour when using blockdev_add with encrypted
images for the running vs stopped CPU state.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-12-berrange@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange 446d306d23 qcow2: make qcow2_encrypt_sectors encrypt in place
Instead of requiring separate input/output buffers for
encrypting data, change qcow2_encrypt_sectors() to assume
use of a single buffer, encrypting in place. The current
callers all used the same buffer for input/output already.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-11-berrange@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange d85f4222b4 qcow: convert QCow to use QCryptoBlock for encryption
This converts the qcow driver to make use of the QCryptoBlock
APIs for encrypting image content. This is only wired up to
permit use of the legacy QCow encryption format. Users who wish
to have the strong LUKS format should switch to qcow2 instead.

With this change it is now required to use the QCryptoSecret
object for providing passwords, instead of the current block
password APIs / interactive prompting.

  $QEMU \
    -object secret,id=sec0,file=/home/berrange/encrypted.pw \
    -drive file=/home/berrange/encrypted.qcow,encrypt.format=aes,\
           encrypt.key-secret=sec0

Though note that running QEMU system emulators with the AES
encryption is no longer supported, so while the above syntax
is valid, QEMU will refuse to actually run the VM in this
particular example.

Likewise when creating images with the legacy AES-CBC format

  qemu-img create -f qcow \
    --object secret,id=sec0,file=/home/berrange/encrypted.pw \
    -o encrypt.format=aes,encrypt.key-secret=sec0 \
    /home/berrange/encrypted.qcow 64M

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-10-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange 1fad1f9400 qcow: make encrypt_sectors encrypt in place
Instead of requiring separate input/output buffers for
encrypting data, change encrypt_sectors() to assume
use of a single buffer, encrypting in place. One current
caller uses the same buffer for input/output already
and the other two callers are easily converted to do so.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-9-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:56 +02:00
Daniel P. Berrange 0cb8d47ba9 block: deprecate "encryption=on" in favor of "encrypt.format=aes"
Historically the qcow & qcow2 image formats supported a property
"encryption=on" to enable their built-in AES encryption. We'll
soon be supporting LUKS for qcow2, so need a more general purpose
way to enable encryption, with a choice of formats.

This introduces an "encrypt.format" option, which will later be
joined by a number of other "encrypt.XXX" options. The use of
a "encrypt." prefix instead of "encrypt-" is done to facilitate
mapping to a nested QAPI schema at later date.

e.g. the preferred syntax is now

  qemu-img create -f qcow2 -o encrypt.format=aes demo.qcow2

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-8-berrange@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:55 +02:00
Daniel P. Berrange 6aa837f7bd qcow: require image size to be > 1 for new images
The qcow driver refuses to open images which are less than
2 bytes in size, but will happily create such images. Add
a check in the create path to avoid this discrepancy.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-5-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:55 +02:00
Daniel P. Berrange 4a47f85431 block: add ability to set a prefix for opt names
When integrating the crypto support with qcow/qcow2, we don't
want to use the bare LUKS option names "hash-alg", "key-secret",
etc. We need to namespace them to match the nested QAPI schema.

e.g. "encrypt.hash-alg", "encrypt.key-secret"

so that they don't clash with any general qcow options at a later
date.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-3-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:55 +02:00
Daniel P. Berrange 306a06e5f7 block: expose crypto option names / defs to other drivers
The block/crypto.c defines a set of QemuOpts that provide
parameters for encryption. This will also be needed by
the qcow/qcow2 integration, so expose the relevant pieces
in a new block/crypto.h header. Some helper methods taking
QemuOpts are changed to take QDict to simplify usage in
other places.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170623162419.26068-2-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-07-11 17:44:55 +02:00
Eric Blake 51b0a48888 block: Make bdrv_is_allocated_above() byte-based
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, for the most part this patch is just the
addition of scaling at the callers followed by inverse scaling at
bdrv_is_allocated().  But some code, particularly stream_run(),
gets a lot simpler because it no longer has to mess with sectors.
Leave comments where we can further simplify by switching to
byte-based iterations, once later patches eliminate the need for
sector-aligned operations.

For ease of review, bdrv_is_allocated() was tackled separately.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake c00716beb3 block: Minimize raw use of bds->total_sectors
bdrv_is_allocated_above() was relying on intermediate->total_sectors,
which is a field that can have stale contents depending on the value
of intermediate->has_variable_length.  An audit shows that we are safe
(we were first calling through bdrv_co_get_block_status() which in
turn calls bdrv_nb_sectors() and therefore just refreshed the current
length), but it's nicer to favor our accessor functions to avoid having
to repeat such an audit, even if it means refresh_total_sectors() is
called more frequently.

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake d6a644bbfe block: Make bdrv_is_allocated() byte-based
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned
on input and that *pnum is sector-aligned on return to the caller,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, this code adds usages like
DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned
values, where the call might reasonbly give non-aligned results
in the future; on the other hand, no rounding is needed for callers
that should just continue to work with byte alignment.

For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_is_allocated().  But
some code, particularly bdrv_commit(), gets a lot simpler because it
no longer has to mess with sectors; also, it is now possible to pass
NULL if the caller does not care how much of the image is allocated
beyond the initial offset.  Leave comments where we can further
simplify once a later patch eliminates the need for sector-aligned
requests through bdrv_is_allocated().

For ease of review, bdrv_is_allocated_above() will be tackled
separately.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake 6f8e35e241 backup: Switch backup_run() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of backups to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are cluster-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 03f5d60bbf backup: Switch backup_do_cow() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake f6ac207893 backup: Switch block_backup.h to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting
the public interface to backup jobs (no semantic change), including
a change to CowRequest to track by bytes instead of cluster indices.

Note that this does not change the difference between the public
interface (starting point, and size of the subsequent range) and
the internal interface (starting and end points).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake cf79cdf662 backup: Switch BackupBlockJob to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting an
internal structure (no semantic change), and all references to
tracking progress.  Drop a redundant local variable bytes_per_cluster.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake e8a81e9cad block: Drop unused bdrv_round_sectors_to_clusters()
Now that the last user [mirror_iteration()] has converted to using
bytes, we no longer need a function to round sectors to clusters.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake fb2ef7919b mirror: Switch mirror_iteration() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of mirroring to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are both sector-aligned and multiples of the granularity).  Drop
the now-unused mirror_clip_sectors().

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake ae4cc8777b mirror: Switch mirror_do_read() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function, preserving all existing semantics, and adding one more
assertion that things are still sector-aligned (so that conversions
to sectors in mirror_read_complete don't need to round).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 782d97efec mirror: Switch mirror_cow_align() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change), and add mirror_clip_bytes() as a
counterpart to mirror_clip_sectors().  Some of the conversion is
a bit tricky, requiring temporaries to convert between units; it
will be cleared up in a following patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 931e52607f mirror: Update signature of mirror_clip_sectors()
Rather than having a void function that modifies its input
in-place as the output, change the signature to reduce a layer
of indirection and return the result.

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake e6f2419389 mirror: Switch mirror_do_zero_or_discard() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake b436982f04 mirror: Switch MirrorBlockJob to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting an
internal structure (no semantic change), and all references to the
buffer size.

Add an assertion that our use of s->granularity >> BDRV_SECTOR_BITS
(necessary for interaction with sector-based dirty bitmaps, until
a later patch converts those to be byte-based) does not suffer from
truncation problems.

[checkpatch has a false positive on use of MIN() in this patch]

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 317a6676a2 commit: Switch commit_run() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of committing to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are sector-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake d8a9858408 commit: Switch commit_populate() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Start by converting an
internal function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake d535435f4a stream: Switch stream_run() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of streaming to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are sector-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 158c649257 stream: Drop reached_end for stream_complete()
stream_complete() skips the work of rewriting the backing file if
the job was cancelled, if data->reached_end is false, or if there
was an error detected (non-zero data->ret) during the streaming.
But note that in stream_run(), data->reached_end is only set if the
loop ran to completion, and data->ret is only 0 in two cases:
either the loop ran to completion (possibly by cancellation, but
stream_complete checks for that), or we took an early goto out
because there is no bs->backing.  Thus, we can preserve the same
semantics without the use of reached_end, by merely checking for
bs->backing (and logically, if there was no backing file, streaming
is a no-op, so there is no backing file to rewrite).

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 8493211c02 stream: Switch stream_populate() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Start by converting an
internal function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake 5cb1a49e01 trace: Show blockjob actions via bytes, not sectors
Upcoming patches are going to switch to byte-based interfaces
instead of sector-based.  Even worse, trace_backup_do_cow_enter()
had a weird mix of cluster and sector indices.

The trace interface is low enough that there are no stability
guarantees, and therefore nothing wrong with changing our units,
even in cases like trace_backup_do_cow_skip() where we are not
changing the trace output.  So make the tracing uniformly use
bytes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake f3e4ce4af3 blockjob: Track job ratelimits via bytes, not sectors
The user interface specifies job rate limits in bytes/second.
It's pointless to have our internal representation track things
in sectors/second, particularly since we want to move away from
sector-based interfaces.

Fix up a doc typo found while verifying that the ratelimit
code handles the scaling difference.

Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be
cleaned up later when functions are converted to iterate over
images by bytes rather than by sectors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Hervé Poussineau 8b544293ef vvfat: change OEM name to 'MSWIN4.1'
According to specification:
"'MSWIN4.1' is the recommanded setting, because it is the setting least likely
to cause compatibility problems. If you want to put something else in here,
that is your option, but the result may be that some FAT drivers might not
recognize the volume."

Specification: "FAT: General overview of on-disk format" v1.03, page 9
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Hervé Poussineau 78f002c901 vvfat: handle KANJI lead byte 0xe5
Specification: "FAT: General overview of on-disk format" v1.03, page 23
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 6817efea3a vvfat: limit number of entries in root directory in FAT12/FAT16
FAT12/FAT16 root directory is two sectors in size, which allows only 512 directory entries.
Prevent QEMU startup if too much files exist, instead of overflowing root directory.

Also introduce variable root_entries, which will be required for FAT32.

Fixes: https://bugs.launchpad.net/qemu/+bug/1599539/comments/4
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 339cebcc01 vvfat: correctly generate numeric-tail of short file names
More specifically:
- try without numeric-tail only if LFN didn't have invalid short chars
- start at ~1 (instead of ~0)
- handle case if numeric tail is more than one char (ie > 10)

Windows 9x Scandisk doesn't see anymore mismatches between short file names and
long file names for non-ASCII filenames.

Specification: "FAT: General overview of on-disk format" v1.03, page 31
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 0c36111f57 vvfat: correctly create base short names for non-ASCII filenames
More specifically, create short name from filename and change blacklist of
invalid chars to whitelist of valid chars.

Windows 9x also now correctly see long file names of filenames containing a space,
but Scandisk still complains about mismatch between SFN and LFN.

[kwolf: Build fix for this intermediate patch (it included declarations
 for variables that are only used in the next patch) ]

Specification: "FAT: General overview of on-disk format" v1.03, pages 30-31
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 09ec4119fb vvfat: correctly create long names for non-ASCII filenames
Assume that input filename is encoded as UTF-8, so correctly create UTF-16 encoding.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau f82d92bb02 vvfat: always create . and .. entries at first and in that order
readdir() doesn't always return . and .. entries at first and in that order.
This leads to not creating them at first in the directory, which raises some
errors on file system checking utilities like MS-DOS Scandisk.

Specification: "FAT: General overview of on-disk format" v1.03, page 25

Fixes: https://bugs.launchpad.net/qemu/+bug/1599539
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 92e28d8220 vvfat: fix field names in FAT12/FAT16 and FAT32 boot sectors
Specification: "FAT: General overview of on-disk format" v1.03, pages 11-13
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 4dc705dc7e vvfat: introduce offset_to_bootsector, offset_to_fat and offset_to_root_dir
- offset_to_bootsector is the number of sectors up to FAT bootsector
- offset_to_fat is the number of sectors up to first File Allocation Table
- offset_to_root_dir is the number of sectors up to root directory sector

Replace first_sectors_number - 1 by offset_to_bootsector.
Replace first_sectors_number by offset_to_fat.
Replace faked_sectors by offset_to_rootdir.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau ad05b31857 vvfat: rename useless enumeration values
MODE_FAKED and MODE_RENAMED are not and were never used.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 5f5b29dfce vvfat: fix typos
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau d6a7e54ed3 vvfat: replace tabs by 8 spaces
This was a complete mess. On 2299 indented lines:
- 1329 were with spaces only
- 617 with tabulations only
- 353 with spaces and tabulations

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Hervé Poussineau 139921aaa7 vvfat: fix qemu-img map and qemu-img convert
- bs->total_sectors is the number of sectors of the whole disk
- s->sector_count is the number of sectors of the FAT partition

This fixes the following assert in qemu-img map:
qemu-img.c:2641: get_block_status: Assertion `nb_sectors' failed.

This also fixes an infinite loop in qemu-img convert.

Fixes: 4480e0f924
Fixes: https://bugs.launchpad.net/qemu/+bug/1599539
Cc: qemu-stable@nongnu.org
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Eric Blake 544daf6679 blkdebug: Support .bdrv_co_get_block_status
Without a passthrough status of BDRV_BLOCK_RAW, anything wrapped by
blkdebug appears 100% allocated as data.  Better is treating it the
same as the underlying file being wrapped.

Update iotest 177 for the new expected output.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Eric Blake d5254033da block: Simplify use of BDRV_BLOCK_RAW
The lone caller that cares about a return of BDRV_BLOCK_RAW
(namely, io.c:bdrv_co_get_block_status) completely replaces the
return value, so there is no point in passing BDRV_BLOCK_DATA.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Eric Blake 81c219ac6c block: Guarantee that *file is set on bdrv_get_block_status()
We document that *file is valid if the return is not an error and
includes BDRV_BLOCK_OFFSET_VALID, but forgot to obey this contract
when a driver (such as blkdebug) lacks a callback.  Messed up in
commit 67a0fd2 (v2.6), when we added the file parameter.

Enhance qemu-iotest 177 to cover this, using a sequence that would
print garbage or even SEGV, because it was dererefencing through
uninitialized memory.  [The resulting test output shows that we
have less-than-ideal block status from the blkdebug driver, but
that's a separate fix coming up soon.]

Setting *file on all paths that return BDRV_BLOCK_OFFSET_VALID is
enough to fix the crash, but we can go one step further: always
setting *file, even on error, means that a broken caller that
blindly dereferences file without checking for error is now more
likely to get a reliable SEGV instead of randomly acting on garbage,
making it easier to diagnose such buggy callers.  Adding an
assertion that file is set where expected doesn't hurt either.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:05 +02:00
Paolo Bonzini 96d06835dc nbd: fix NBD over TLS
When attaching the NBD QIOChannel to an AioContext, the TLS channel should
be used, not the underlying socket channel.  This is because, trivially,
the TLS channel will be the one that we read/write to and thus the one
that will get the qio_channel_yield() call.

Fixes: ff82911cd3
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Eric Blake c61e684e44 block: Exploit BDRV_BLOCK_EOF for larger zero blocks
When we have a BDS with unallocated clusters, but asking the status
of its underlying bs->file or backing layer encounters an end-of-file
condition, we know that the rest of the unallocated area will read as
zeroes.  However, pre-patch, this required two separate calls to
bdrv_get_block_status(), as the first call stops at the point where
the underlying file ends.  Thanks to BDRV_BLOCK_EOF, we can now widen
the results of the primary status if the secondary status already
includes BDRV_BLOCK_ZERO.

In turn, this fixes a TODO mentioned in iotest 154, where we can now
see that all sectors in a partial cluster at the end of a file read
as zero when coupling the shorter backing file's status along with our
knowledge that the remaining sectors came from an unallocated cluster.

Also, note that the loop in bdrv_co_get_block_status_above() had an
inefficent exit: in cases where the active layer sets BDRV_BLOCK_ZERO
but does NOT set BDRV_BLOCK_ALLOCATED (namely, where we know we read
zeroes merely because our unallocated clusters lie beyond the backing
file's shorter length), we still ended up probing the backing layer
even though we already had a good answer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170505021500.19315-3-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-30 21:48:06 +08:00
Eric Blake fb0d8654ff block: Add BDRV_BLOCK_EOF to bdrv_get_block_status()
Just as the block layer already sets BDRV_BLOCK_ALLOCATED as a
shortcut for subsequent operations, there are also some optimizations
that are made easier if we can quickly tell that *pnum will advance
us to the end of a file, via a new BDRV_BLOCK_EOF which gets set
by the block layer.

This just plumbs up the new bit; subsequent patches will make use
of it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170505021500.19315-2-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-30 21:48:06 +08:00
Max Reitz f69165a8fe block: Do not strcmp() with NULL uri->scheme
uri_parse(...)->scheme may be NULL. In fact, probably every field may be
NULL, and the callers do test this for all of the other fields but not
for scheme (except for block/gluster.c; block/vxhs.c does not access
that field at all).

We can easily fix this by using g_strcmp0() instead of strcmp().

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613205726.13544-1-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:54:46 +02:00
Max Reitz 05cc758a3d blkverify: Catch bs->exact_filename overflow
The bs->exact_filename field may not be sufficient to store the full
blkverify node filename. In this case, we should not generate a filename
at all instead of an unusable one.

Cc: qemu-stable@nongnu.org
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613172006.19685-3-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:54:46 +02:00
Max Reitz de81d72d3d blkdebug: Catch bs->exact_filename overflow
The bs->exact_filename field may not be sufficient to store the full
blkdebug node filename. In this case, we should not generate a filename
at all instead of an unusable one.

Cc: qemu-stable@nongnu.org
Reported-by: Qu Wenruo <quwenruo@cn.fujitsu.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170613172006.19685-2-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:54:46 +02:00
Manos Pitsidianakis f5a5ca7969 block: change variable names in BlockDriverState
Change the 'int count' parameter in *pwrite_zeros, *pdiscard related
functions (and some others) to 'int bytes', as they both refer to bytes.
This helps with code legibility.

Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 20170609101808.13506-1-el13635@mail.ntua.gr
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:54:46 +02:00
Kevin Wolf c5f1ad429c block: Remove bdrv_aio_readv/writev/flush()
These functions are unused now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf 0f714ec706 qed: Use bdrv_co_* for coroutine_fns
All functions that are marked coroutine_fn can directly call the
bdrv_co_* version of functions instead of going through the wrapper.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf 87f0d88261 qed: Add coroutine_fn to I/O path functions
Now that we stay in coroutine context for the whole request when doing
reads or writes, we can add coroutine_fn annotations to many functions
that can do I/O or yield directly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf c0e8f98927 qed: Use a coroutine for need_check_timer
This fixes the last place where we degraded from AIO to actual blocking
synchronous I/O requests. Putting it into a coroutine means that instead
of blocking, the coroutine simply yields while doing I/O.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf 48cc565e76 qed: Simplify request handling
Now that we process a request in the same coroutine from beginning to
end and don't drop out of it any more, we can look like a proper
coroutine-based driver and simply call qed_aio_next_io() and get a
return value from it instead of spawning an additional coroutine that
reenters the parent when it's done.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf 0806c3b5dd qed: Use CoQueue for serialising allocations
Now that we're running in coroutine context, the ad-hoc serialisation
code (which drops a request that has to wait out of coroutine context)
can be replaced by a CoQueue.

This means that when we resume a serialised request, it is running in
coroutine context again and its I/O isn't blocking any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf 89f89709c7 qed: Implement .bdrv_co_readv/writev
Most of the qed code is now synchronous and matches the coroutine model.
One notable exception is the serialisation between requests which can
still schedule a callback. Before we can replace this with coroutine
locks, let's convert the driver's external interfaces to the coroutine
versions.

We need to be careful to handle both requests that call the completion
callback directly from the calling coroutine (i.e. fully synchronous
code) and requests that involve some callback, so that we need to yield
and wait for the completion callback coming from outside the coroutine.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf 018598747c qed: Remove recursion in qed_aio_next_io()
Instead of calling itself recursively as the last thing, just convert
qed_aio_next_io() into a loop.

This patch is best reviewed with 'git show -w' because most of it is
just whitespace changes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:15 +02:00
Kevin Wolf dddf8db10b qed: Remove ret argument from qed_aio_next_io()
All callers pass ret = 0, so we can just remove it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 0596be7e6a qed: Add return value to qed_aio_read/write_data()
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf d6daddcdeb qed: Add return value to qed_aio_write_inplace/alloc()
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf a101341aa0 qed: Add return value to qed_aio_write_cow()
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

While refactoring qed_aio_write_alloc() to accomodate the change,
qed_aio_write_zero_cluster() ended up with a single line, so I chose to
inline that line and remove the function completely.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf eaf0bc56f5 qed: Add return value to qed_aio_write_main()
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 88d2dd72bc qed: Add return value to qed_aio_write_l2_update()
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf fb18de21e0 qed: Add return value to qed_aio_write_l1_update()
Don't recurse into qed_aio_next_io() and qed_aio_complete() here, but
just return an error code and let the caller handle it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf fae25ac7bd qed: Inline qed_commit_l2_update()
qed_commit_l2_update() is unconditionally called at the end of
qed_aio_write_l1_update(). Inline it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf a4d8f1aee1 qed: Make qed_aio_write_main() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 3e248cdcd9 qed: Make qed_aio_read_data() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 453e53e2a1 qed: Remove callback from qed_write_table()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 29470d11bf qed: Remove GenericCB
The GenericCB infrastructure isn't used any more. Remove it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 602b57fba4 qed: Make qed_write_table() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf f13d712bb2 qed: Remove callback from qed_write_header()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 7076309aef qed: Make qed_write_header() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf b4ac32f34f qed: Remove callback from qed_copy_from_backing_file()
With this change, qed_aio_write_prefill() and qed_aio_write_postfill()
collapse into a single function. This is reflected by a rename of the
combined function to qed_aio_write_cow().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 0f7aa24d2c qed: Make qed_copy_from_backing_file() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf e85c528142 qed: Make qed_read_backing_file() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 0f21b7a1b7 qed: Remove callback from qed_find_cluster()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf a8165d2d66 qed: Remove callback from qed_read_l2_table()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf f6513529c6 qed: Remove callback from qed_read_table()
Instead of passing the return value to a callback, return it to the
caller so that the callback can be inlined there.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 11273076e9 qed: Make qed_read_table() synchronous
Note that this code is generally not running in coroutine context, so
this is an actual blocking synchronous operation. We'll fix this in a
moment.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Kevin Wolf 3b7cd9fd8f qed: Use bottom half to resume waiting requests
The qed driver serialises allocating write requests. When the active
allocation is finished, the AIO callback is called, but after this, the
next allocating request is immediately processed instead of leaving the
coroutine. Resuming another allocation request in the same request
coroutine means that the request now runs in the wrong coroutine.

The following is one of the possible effects of this: The completed
request will generally reenter its request coroutine in a bottom half,
expecting that it completes the request in bdrv_driver_pwritev().
However, if the second request actually yielded before leaving the
coroutine, the reused request coroutine is in an entirely different
place and is reentered prematurely. Not a good idea.

Let's make sure that we exit the coroutine after completing the first
request by resuming the next allocating request only with a bottom
half.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-06-26 14:51:14 +02:00
Alberto Garcia 24990c5b95 qcow2: Use offset_into_cluster() and offset_to_l2_index()
We already have functions for doing these calculations, so let's use
them instead of doing everything by hand. This makes the code a bit
more readable.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia ee22a9d869 qcow2: Merge the writing of the COW regions with the guest data
If the guest tries to write data that results on the allocation of a
new cluster, instead of writing the guest data first and then the data
from the COW regions, write everything together using one single I/O
operation.

This can improve the write performance by 25% or more, depending on
several factors such as the media type, the cluster size and the I/O
request size.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia 86b862c431 qcow2: Pass a QEMUIOVector to do_perform_cow_{read,write}()
Instead of passing a single buffer pointer to do_perform_cow_write(),
pass a QEMUIOVector. This will allow us to merge the write requests
for the COW regions and the actual data into a single one.

Although do_perform_cow_read() does not strictly need to change its
API, we're doing it here as well for consistency.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia b3cf1c7cf8 qcow2: Allow reading both COW regions with only one request
Reading both COW regions requires two separate requests, but it's
perfectly possible to merge them and perform only one. This generally
improves performance, particularly on rotating disk drives. The
downside is that the data in the middle region is read but discarded.

This patch takes a conservative approach and only merges reads when
the size of the middle region is <= 16KB.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia 672f0f2c4b qcow2: Split do_perform_cow() into _read(), _encrypt() and _write()
This patch splits do_perform_cow() into three separate functions to
read, encrypt and write the COW regions.

perform_cow() can now read both regions first, then encrypt them and
finally write them to disk. The memory allocation is also done in
this function now, using one single buffer large enough to hold both
regions.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia 99450c6fb9 qcow2: Make perform_cow() call do_perform_cow() twice
Instead of calling perform_cow() twice with a different COW region
each time, call it just once and make perform_cow() handle both
regions.

This patch simply moves code around. The next one will do the actual
reordering of the COW operations.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia e034f5bcbc qcow2: Use unsigned int for both members of Qcow2COWRegion
Qcow2COWRegion has two attributes:

- The offset of the COW region from the start of the first cluster
  touched by the I/O request. Since it's always going to be positive
  and the maximum request size is at most INT_MAX, we can use a
  regular unsigned int to store this offset.

- The size of the COW region in bytes. This is guaranteed to be >= 0,
  so we should use an unsigned type instead.

In x86_64 this reduces the size of Qcow2COWRegion from 16 to 8 bytes.
It will also help keep some assertions simpler now that we know that
there are no negative numbers.

The prototype of do_perform_cow() is also updated to reflect these
changes.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia 026ac1586b qcow2: Remove unused Error variable in do_perform_cow()
We are using the return value of qcow2_encrypt_sectors() to detect
problems but we are throwing away the returned Error since we have no
way to report it to the user. Therefore we can simply get rid of the
local Error variable and pass NULL instead.

Alternatively we could try to figure out a way to pass the original
error instead of simply returning -EIO, but that would be more
invasive, so let's keep the current approach.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Alberto Garcia 0d2fac8ede throttle: Update throttle-groups.c documentation
There used to be throttle_timers_{detach,attach}_aio_context() calls
in bdrv_set_aio_context(), but since 7ca7f0f6db
they are now in blk_set_aio_context().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Stefan Hajnoczi ea17c9d20d block: use BDRV_POLL_WHILE() in bdrv_rw_vmstate()
Calling aio_poll() directly may have been fine previously, but this is
the future, man!  The difference between an aio_poll() loop and
BDRV_POLL_WHILE() is that BDRV_POLL_WHILE() releases the AioContext
around aio_poll().

This allows the IOThread to run fd handlers or BHs to complete the
request.  Failure to release the AioContext causes deadlocks.

Using BDRV_POLL_WHILE() partially fixes a 'savevm' hang with -object
iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Stefan Hajnoczi dc88a467ec block: count bdrv_co_rw_vmstate() requests
Call bdrv_inc/dec_in_flight() for vmstate reads/writes.  This seems
unnecessary at first glance because vmstate reads/writes are done
synchronously while the guest is stopped.  But we need the bdrv_wakeup()
in bdrv_dec_in_flight() so the main loop sees request completion.
Besides, it's cleaner to count vmstate reads/writes like ordinary
read/write requests.

The bdrv_wakeup() partially fixes a 'savevm' hang with -object iothread.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-06-26 14:51:13 +02:00
Kevin Wolf 4f78a16fee commit: Fix completion with extra reference
commit_complete() can't assume that after its block_job_completed() the
job is actually immediately freed; someone else may still be holding
references. In this case, the op blockers on the intermediate nodes make
the graph reconfiguration in the completion code fail.

Call block_job_remove_all_bdrv() manually so that we know for sure that
any blockers on intermediate nodes are given up.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2017-06-26 14:51:12 +02:00
Peter Maydell 84e3d0725b QAPI patches for 2017-06-09
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJZSRWrAAoJEDhwtADrkYZTVOQP/RK8br2A1Cn7LeVG6jnKz5hJ
 OqyII77x8I2RachWvnwxQeHPMEVDuz3y2WjL+80s6peXlpR6y/13w7oX0f6aiJuo
 +T9khTqMv2I7HsM5UCsXAJpFPHT7r90b4x8nstY80YLGe7lA7L6yk6PGyCxHThwA
 mOiTKDw6/Xb/yZGrS2Favrun7juNpAs0Ec1IAkaA8xsEgVkd6tDv281rmHqvibl/
 //90VfJp3nHFZ12FCQ1HzA42Eigtmo/fIk9LnAzBoYG0zw0cnzjuv0BNzs/JwuUZ
 /VskeD1cViQ4yzFnPpjOavjYjTN854/JTJzm7gZ7dTQ6/l3ykoY6NDE8p1BLuHlC
 p2RKkg20EeZlpOEtMQ4g6iyG6EUxaKcEiXmQ31LqN/LJwxTYbo5B5nCHMjrt4gxe
 MqFBJQSNsJ7QjZ7Qa7pADMCi/G0m7/0dN8vBqSr4vcbLVvdbw/yb/9s33wXGrUj1
 PyXM2ymi+vvSqcXtNXKshsJLxJSJxO1tm2tRIANDTabQ00yxs8dOYnQnbQFR94fp
 6nrE2PnjZqgqk69aNDJEbngj6Tgx44nyTr1+Q17juZf9nTCE5QmBE1J0IRoykCJn
 E8+T63ZxtIxVV2yLi5xBjmZaZtPyJRGGeUXunA10SuWrHzupEcBuhFhFYd2MFM5L
 fsojALN2K3Gdx2+CmAo2
 =O9Vv
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-06-09-v2' into staging

QAPI patches for 2017-06-09

# gpg: Signature made Tue 20 Jun 2017 13:31:39 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-06-09-v2: (41 commits)
  tests/qdict: check more get_try_int() cases
  console: use get_uint() for "head" property
  i386/cpu: use get_uint() for "min-level"/"min-xlevel" properties
  numa: use get_uint() for "size" property
  pnv-core: use get_uint() for "core-pir" property
  pvpanic: use get_uint() for "ioport" property
  auxbus: use get_uint() for "addr" property
  arm: use get_uint() for "mp-affinity" property
  xen: use get_uint() for "max-ram-below-4g" property
  pc: use get_uint() for "hpet-intcap" property
  pc: use get_uint() for "apic-id" property
  pc: use get_uint() for "iobase" property
  acpi: use get_uint() for "pci-hole*" properties
  acpi: use get_uint() for various acpi properties
  acpi: use get_uint() for "acpi-pcihp-io*" properties
  platform-bus: use get_uint() for "addr" property
  bcm2835_fb: use {get, set}_uint() for "vcram-size" and "vcram-base"
  aspeed: use {set, get}_uint() for "ram-size" property
  pcihp: use get_uint() for "bsel" property
  pc-dimm: make "size" property uint64
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-22 11:34:39 +01:00
Peter Maydell 65a0e3e842 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
 
 iQEtBAABCAAXBQJZQyPmEBxmYW16QHJlZGhhdC5jb20ACgkQyjViTGqRccaWrgf/
 SCAHpi4gzWbr7AN03jP16Qy/kqNik6F7LTNSqrRbvBPb3TNchDd4z44SAghK5m/r
 +IlYQc20sBZ60tRHIHAUSF2WNcea2pj1v3ZVgjrI7hiJ3DXPiqqt/dAR/W/BLIDO
 tAHAVF6Pnrjm9DC4d2zATLDHvcHMzWOsnePh7XcOm44REbwUr3GDg6bf2+j+5yfS
 9ewmXfh8z4w1IvSn+f5B+IeCvGvJNA1D55dqcGo8Ivlg9PnElziXFaXO2s7UiLIM
 mF3eTSIbJQNNN+E+0lpRpnqQiq+Txxggu61Q4f8bOTBhEOPa3etj1ydnXMVbvX25
 6SUuBfGh51tyOIZOJz3GtA==
 =9b+J
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/famz/tags/docker-and-block-pull-request' into staging

# gpg: Signature made Fri 16 Jun 2017 01:18:46 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/docker-and-block-pull-request: (23 commits)
  block: make accounting thread-safe
  block: split BlockAcctStats creation and setup
  block: introduce block_account_one_io
  block: protect modification of dirty bitmaps with a mutex
  migration/block: reset dirty bitmap before reading
  block: introduce dirty_bitmap_mutex
  block: protect tracked_requests and flush_queue with reqs_lock
  block: access write_gen with atomics
  block: use Stat64 for wr_highest_offset
  util: add stats64 module
  throttle-groups: protect throttled requests with a CoMutex
  throttle-groups: do not use qemu_co_enter_next
  throttle-groups: only start one coroutine from drained_begin
  block: access io_plugged with atomic ops
  block: access wakeup with atomic ops
  block: access serialising_in_flight with atomic ops
  block: access io_limits_disabled with atomic ops
  block: access quiesce_counter with atomic ops
  block: access copy_on_read with atomic ops
  docker: Add flex and bison to centos6 image
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-20 16:01:15 +01:00
Peter Maydell 7e56accdaf * nbd and qemu-nbd fixes (Eric, Max)
* nbd refactoring (Vladimir)
 * vhost-user-scsi, take N+1 (Felipe)
 * replace memory_region_set_fd with memory_region_init_ram_from_fd (Marc-André)
 * docs/ movement (Paolo)
 * megasas TOCTOU fixes (Paolo)
 * make async_safe_run_on_cpu work on kvm/hax accelerators (Paolo)
 * Build system and poison.h improvements (Thomas)
 * -accel thread=xxx fix (Thomas)
 * move files to accel/ (Yang Zhong)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJZQli7AAoJEL/70l94x66DYAUH/icBoGSyY70G033BxbWW+jPW
 pjqobQw3YnFmiHSkXy1Xd4LKa0dyrVdmVECOAvB0embiN6pIk2UpeT4aQNHffG3f
 Y9wMRKnp6RMLpzwRYbde4rAlitn52ecGIANtFUC6RTnL7wKCuh3beWruKmkudG8V
 ZAsmX30pPFJxtQYJ0/q3c2jB6V87h2GocRLAmy1lF5/X1pXQ2fiKOYOJvV3HDY1U
 UQ+Ixuf4tTbyBwTm4Lx2tD3+olkVTUvcATqxhI+1JKu6lE0Dgb58urYDOfvVPFDl
 98s6bU53f0gpOcb1/wKOAXKdZ2tvqRtwsP8IoPDnUpk8HGxbNWdofoxsO37vVIU=
 =zeKT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* nbd and qemu-nbd fixes (Eric, Max)
* nbd refactoring (Vladimir)
* vhost-user-scsi, take N+1 (Felipe)
* replace memory_region_set_fd with memory_region_init_ram_from_fd (Marc-André)
* docs/ movement (Paolo)
* megasas TOCTOU fixes (Paolo)
* make async_safe_run_on_cpu work on kvm/hax accelerators (Paolo)
* Build system and poison.h improvements (Thomas)
* -accel thread=xxx fix (Thomas)
* move files to accel/ (Yang Zhong)

# gpg: Signature made Thu 15 Jun 2017 10:51:55 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (41 commits)
  vhost-user-scsi: Introduce a vhost-user-scsi sample application
  vhost-user-scsi: Introduce vhost-user-scsi host device
  qemu-doc: include version number
  docs: create interop/ subdirectory
  include/exec/poison: Mark some CONFIG defines as poisoned, too
  include/exec/poison: Add missing TARGET defines
  nbd/server: refactor nbd_trip
  nbd/server: rename rc to ret
  nbd/server: get rid of fail: return rc
  nbd/server: nbd_negotiate: fix error path
  nbd/server: remove NBDClientNewData
  nbd/server: refactor nbd_co_receive_request
  nbd/server: get rid of EAGAIN dead code
  nbd/server: refactor nbd_co_send_reply
  nbd/server: get rid of ssize_t
  nbd/server: get rid of nbd_negotiate_read and friends
  nbd: make nbd_drop public
  nbd: rename read_sync and friends
  accel: move kvm related accelerator files into accel/
  tcg: move tcg backend files into accel/tcg/
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-06-20 14:20:34 +01:00
Marc-André Lureau 01b2ffcedd qapi: merge QInt and QFloat in QNum
We would like to use a same QObject type to represent numbers, whether
they are int, uint, or floats. Getters will allow some compatibility
between the various types if the number fits other representations.

Add a few more tests while at it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170607163635.17635-7-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[parse_stats_intervals() simplified a bit, comment in
test_visitor_in_int_overflow() tidied up, suppress bogus warnings]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-06-20 14:31:31 +02:00
Paolo Bonzini 5b50bf77ce block: make accounting thread-safe
I'm not trying too hard yet.  Later, with multiqueue support,
this may cause mutex contention or cacheline bouncing.

Cc: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-20-pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 9caa6f3dbe block: split BlockAcctStats creation and setup
block_acct_destroy is called unconditionally in blk_delete, but there is
no BlockAcctStats function that is called unconditionally in blk_new.
Split block_acct_init in two, so that it will be possible to create a
QemuMutex in block_acct_init and destroy it in block_acct_cleanup.

Cc: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-19-pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 39c1b4254e block: introduce block_account_one_io
This is the common code to account operations that produced actual I/O.

Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-18-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini b64bd51efa block: protect modification of dirty bitmaps with a mutex
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-17-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 2119882c7e block: introduce dirty_bitmap_mutex
It protects only the list of dirty bitmaps; in the next patch we will
also protect their content.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-15-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 3783fa3dd3 block: protect tracked_requests and flush_queue with reqs_lock
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-14-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini 47fec59941 block: access write_gen with atomics
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-13-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00
Paolo Bonzini f7946da274 block: use Stat64 for wr_highest_offset
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170605123908.18777-12-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-06-16 07:55:00 +08:00