Commit Graph

690 Commits

Author SHA1 Message Date
Denis V. Lunev 5a64e94251 qcow2: add tracepoints for qcow2_co_write_zeroes
This patch follows guidelines of all other tracepoints in qcow2, like ones
in qcow2_co_writev. I think that they should dump values in the same
quantities or be changed all together.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-4-git-send-email-den@openvz.org>
[eblake: typo fix in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Denis V. Lunev ba142846b0 qcow2: simplify logic in qcow2_co_write_zeroes
Unaligned requests will occupy only one cluster. This is true since the
previous commit. Simplify the code taking this consideration into
account.

In other words, the caller is now buggy if it ever passes us an unaligned
request that crosses cluster boundaries (the only requests that can cross
boundaries will be aligned).

There are no other changes so far.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-3-git-send-email-den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Peter Maydell 6bd8ab6889 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJXPdcnAAoJEH8JsnLIjy/WPEoQAK5vlRYqvQrrevMJviT4ZPUX
 cGGbabOcmfTBHGAgGwRLg+vQ043Sgu14JjtNbrsoSsBwAl9eAhAVGOimiieaY3vR
 35OOUxECswArJzK8I4XRx4KhI871Yq+8kHILPoXpF8L7YU38Zqa1D5z2dcOKYrL8
 Oy5IEfd1+Qfpxg/txKIioP5BzKVpz3V9/8GRNo0iAl7c806NoYFpnM0TXsed9Fjr
 YvUn1AdGHUF0/pV6vU46Qxz4yy1Q+cuoh923z6+YvXTcwok7PbjhAQWWA0qvSTuG
 otnPKMPBhYa6g7XOPD9Mra986vs6vBEGiPS5uqXoM5FqxF4Hc9LIeHEr+3hb+m53
 NLOmGqfct0USY9r6rXsOhZQb7nZCDuhaedv33ZfgE0T0cYxIilHs5PhgFAWfthhP
 aNJYlzbJUhqhTi7CJrJcFoGbNQDxux5qtlFo43M4vz/WYYDrwu8P7O3YO+sH0jU1
 EXJnbtztQvwfsiIEbIzvBRQl3XD9QmCfYO3lRbOwdCnd3ZLy47E2bze4gV3DwzK7
 CsBr+sa49xI8LMswPxTms+A+Inndn8O0mGI32Zi4nBKapjpy5Fb4YG6z8+WPfTKp
 Il1PsSgG84wm4YxGWty/UI4DoPY+hqlIIz1CNuRRNQtZTybLgNCK8ZKYbVlRppmf
 pGPpQ8pmqkeFLmx8hecm
 =ntKz
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Thu 19 May 2016 16:09:27 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (31 commits)
  qemu-iotests: Fix regression in 136 on aio_read invalid
  qemu-iotests: Simplify 109 with unaligned qemu-img compare
  qemu-io: Fix recent UI updates
  block: clarify error message for qmp-eject
  qemu-iotests: Some more write_zeroes tests
  qcow2: Fix write_zeroes with partially allocated backing file cluster
  qcow2: fix condition in is_zero_cluster
  block: Propagate AioContext change to all children
  block: Remove BlockDriverState.blk
  block: Don't return throttling info in query-named-block-nodes
  block: Avoid bs->blk in bdrv_next()
  block: Add bdrv_has_blk()
  block: Remove bdrv_aio_multiwrite()
  blockjob: Don't touch BDS iostatus
  blockjob: Don't set iostatus of target
  block: User BdrvChild callback for device name
  block: Use BdrvChild callbacks for change_media/resize
  block: Don't check throttled reqs in bdrv_requests_pending()
  Revert "block: Forbid I/O throttling on nodes with multiple parents for 2.6"
  block: Remove bdrv_move_feature_fields()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-19 16:54:12 +01:00
Kevin Wolf 5efdf53227 qcow2: Fix write_zeroes with partially allocated backing file cluster
In order to correctly check whether a given cluster is read as zero, we
don't only need to check whether bdrv_get_block_status_above() sets
BDRV_BLOCK_ZERO, but also if all sectors for the whole cluster have the
same status.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
2016-05-19 16:45:31 +02:00
Denis V. Lunev f575f145f4 qcow2: fix condition in is_zero_cluster
We should check for (res & BDRV_BLOCK_ZERO) only. The situation when we
will have !(res & BDRV_BLOCK_DATA) and will not have BDRV_BLOCK_ZERO is
not possible for images with bdi.unallocated_blocks_are_zero == true.

For those images where it's false, however, it can happen and we must
not consider the data zeroed then or we would corrupt the image.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-19 16:45:31 +02:00
Paolo Bonzini 58369e22cf qemu-common: stop including qemu/bswap.h from qemu-common.h
Move it to the actual users.  There are still a few includes of
qemu/bswap.h in headers; removing them is left for future work.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Fam Zheng c9e9e9c66c block: Drop superfluous invalidating bs->file from drivers
Now they are invalidated by the block layer, so it's not necessary to
do this in block drivers' implementations of .bdrv_invalidate_cache.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Denis V. Lunev 2928abce6d qcow2: improve qcow2_co_write_zeroes()
There is a possibility that qcow2_co_write_zeroes() will be called
with the partial block. This could be synthetically triggered with
    qemu-io -c "write -z 32k 4k"
and can happen in the real life in qemu-nbd. The latter happens under
the following conditions:
    (1) qemu-nbd is started with --detect-zeroes=on and is connected to the
        kernel NBD client
    (2) third party program opens kernel NBD device with O_DIRECT
    (3) third party program performs write operation with memory buffer
        not aligned to the page
In this case qcow2_co_write_zeroes() is unable to perform the operation
and mark entire cluster as zeroed and returns ENOTSUP. Thus the caller
switches to non-optimized version and writes real zeroes to the disk.

The patch creates a shortcut. If the block is read as zeroes, f.e. if
it is unallocated, the request is extended to cover full block.
User-visible situation with this block is not changed. Before the patch
the block is filled in the image with real zeroes. After that patch the
block is marked as zeroed in metadata. Thus any subsequent changes in
backing store chain are not affected.

Kevin, thank you for a cool suggestion.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake 8341f00dc2 block: Allow BDRV_REQ_FUA through blk_pwrite()
We have several block drivers that understand BDRV_REQ_FUA,
and emulate it in the block layer for the rest by a full flush.
But without a way to actually request BDRV_REQ_FUA during a
pass-through blk_pwrite(), FUA-aware block drivers like NBD are
forced to repeat the emulation logic of a full flush regardless
of whether the backend they are writing to could do it more
efficiently.

This patch just wires up a flags argument; followup patches
will actually make use of it in the NBD driver and in qemu-io.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Max Reitz 4e876bcf2b qcow2: Prevent backing file names longer than 1023
We reject backing file names with a length of more than 1023 characters
when opening a qcow2 file, so we should not produce such files
ourselves.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-04-12 18:06:51 +02:00
Kevin Wolf 72e775c7d9 block: Always set writeback mode in blk_new_open()
All callers of blk_new_open() either don't rely on the WCE bit set after
blk_new_open() because they explicitly set it anyway, or they pass
BDRV_O_CACHE_WB unconditionally.

This patch changes blk_new_open() so that it always enables writeback
mode and asserts that BDRV_O_CACHE_WB is clear. For those callers that
used to pass BDRV_O_CACHE_WB unconditionally, the flag is removed now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-30 12:16:01 +02:00
Daniel P. Berrange e6ff69bf5e block: move encryption deprecation warning into qcow code
For a couple of releases we have been warning

  Encrypted images are deprecated
  Support for them will be removed in a future release.
  You can use 'qemu-img convert' to convert your image to an unencrypted one.

This warning was issued by system emulators, qemu-img, qemu-nbd
and qemu-io. Such a broad warning was issued because the original
intention was to rip out all the code for dealing with encryption
inside the QEMU block layer APIs.

The new block encryption framework used for the LUKS driver does
not rely on the unloved block layer API for encryption keys,
instead using the QOM 'secret' object type. It is thus no longer
appropriate to warn about encryption unconditionally.

When the qcow/qcow2 drivers are converted to use the new encryption
framework too, it will be practical to keep AES-CBC support present
for use in qemu-img, qemu-io & qemu-nbd to allow for interoperability
with older QEMU versions and liberation of data from existing encrypted
qcow2 files.

This change moves the warning out of the generic block code and
into the qcow/qcow2 drivers. Further, the warning is set to only
appear when running the system emulators, since qemu-img, qemu-io,
qemu-nbd are expected to support qcow2 encryption long term now that
the maint burden has been eliminated.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-30 12:12:15 +02:00
Veronia Bahaa f348b6d1a5 util: move declarations out of qemu-common.h
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)

Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Eric Blake 32bafa8fdd qapi: Don't special-case simple union wrappers
Simple unions were carrying a special case that hid their 'data'
QMP member from the resulting C struct, via the hack method
QAPISchemaObjectTypeVariant.simple_union_type().  But by using
the work we started by unboxing flat union and alternate
branches, coupled with the ability to visit the members of an
implicit type, we can now expose the simple union's implicit
type in qapi-types.h:

| struct q_obj_ImageInfoSpecificQCow2_wrapper {
|     ImageInfoSpecificQCow2 *data;
| };
|
| struct q_obj_ImageInfoSpecificVmdk_wrapper {
|     ImageInfoSpecificVmdk *data;
| };
...
| struct ImageInfoSpecific {
|     ImageInfoSpecificKind type;
|     union { /* union tag is @type */
|         void *data;
|-        ImageInfoSpecificQCow2 *qcow2;
|-        ImageInfoSpecificVmdk *vmdk;
|+        q_obj_ImageInfoSpecificQCow2_wrapper qcow2;
|+        q_obj_ImageInfoSpecificVmdk_wrapper vmdk;
|     } u;
| };

Doing this removes asymmetry between QAPI's QMP side and its
C side (both sides now expose 'data'), and means that the
treatment of a simple union as sugar for a flat union is now
equivalent in both languages (previously the two approaches used
a different layer of dereferencing, where the simple union could
be converted to a flat union with equivalent C layout but
different {} on the wire, or to an equivalent QMP wire form
but with different C representation).  Using the implicit type
also lets us get rid of the simple_union_type() hack.

Of course, now all clients of simple unions have to adjust from
using su->u.member to using su->u.member.data; while this touches
a number of files in the tree, some earlier cleanup patches
helped minimize the change to the initialization of a temporary
variable rather than every single member access.  The generated
qapi-visit.c code is also affected by the layout change:

|@@ -7393,10 +7393,10 @@ void visit_type_ImageInfoSpecific_member
|     }
|     switch (obj->type) {
|     case IMAGE_INFO_SPECIFIC_KIND_QCOW2:
|-        visit_type_ImageInfoSpecificQCow2(v, "data", &obj->u.qcow2, &err);
|+        visit_type_q_obj_ImageInfoSpecificQCow2_wrapper_members(v, &obj->u.qcow2, &err);
|         break;
|     case IMAGE_INFO_SPECIFIC_KIND_VMDK:
|-        visit_type_ImageInfoSpecificVmdk(v, "data", &obj->u.vmdk, &err);
|+        visit_type_q_obj_ImageInfoSpecificVmdk_wrapper_members(v, &obj->u.vmdk, &err);
|         break;
|     default:
|         abort();

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-13-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Max Reitz efaa7c4eeb blockdev: Split monitor reference from BB creation
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.

In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".

If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz a55448b368 qapi: Drop QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
Just specifying a custom string is simpler in basically all places that
used it, and in addition, specifying the BB or node name is something we
generally do not do in other error messages when opening a BDS, so we
should not do it here.

This changes the output for iotest 036 (to the better, in my opinion),
so the reference output needs to be changed accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Kevin Wolf 23588797b6 qcow2: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf 6340472c54 block: Use writeback in .bdrv_create() implementations
There's no reason to use a writethrough cache mode while creating an
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Fam Zheng 178b4db7e5 qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng 67a0fd2a9b block: Add "file" output parameter to block status query functions
The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.

Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.

The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Kevin Wolf 191fb11bdf qcow2: Make image inaccessible after failed qcow2_invalidate_cache()
If qcow2_invalidate_cache() fails, we are in a state where qcow2_close()
has already been completed, but the image hasn't been reopened yet.
Calling into any qcow2 function for an image in this state will cause
crashes.

The real solution would be to get rid of the close/open pair and instead
do an atomic reset of the involved data structures, but this isn't
trivial, so let's just make the image inaccessible for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-20 13:36:24 +01:00
Kevin Wolf 140fd5a69c qcow2: Fix BDRV_O_INACTIVE handling in qcow2_invalidate_cache()
What qcow2_invalidate_cache() should do is close the image with
BDRV_O_INACTIVE set and reopen it with the flag cleared. In fact, it
used to do exactly the opposite: qcow2_close() relied on bs->open_flags,
which is already updated to have cleared BDRV_O_INACTIVE at this point,
whereas qcow2_open() was called with s->flags, which has the flag still
set. Fix this.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-20 13:36:23 +01:00
Kevin Wolf ec6d891224 qcow2: Implement .bdrv_inactivate
The callback has to ensure that closing or flushing the image afterwards
wouldn't cause a write access to the image files. This means that just
the caches have to be written out, which is part of the existing
.bdrv_close implementation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-20 13:36:23 +01:00
Kevin Wolf 04c01a5c8f block: Rename BDRV_O_INCOMING to BDRV_O_INACTIVE
Instead of covering only the state of images on the migration
destination before the migration is completed, the flag will also cover
the state of images on the migration source after completion. This
common state implies that the image is technically still open, but no
writes will happen and any cached contents will be reloaded from disk if
and when the image leaves this state.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-20 13:36:23 +01:00
Kevin Wolf b527c9b392 qcow2: Write full header on image creation
When creating a qcow2 image, we didn't necessarily call
qcow2_update_header(), but could end up with the basic header that
qcow2_create2() created manually. One thing that this basic header
lacks is the feature table. Let's make sure that it's always present.

This requires a few updates to test cases as well.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-20 13:36:23 +01:00
Kevin Wolf 1a4828c793 qcow2: Write feature table only for v3 images
Version 2 images don't have feature bits, so writing a feature table to
those images is kind of pointless.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-20 13:36:23 +01:00
Peter Maydell 80c71a241a block: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-01-20 13:36:23 +01:00
Markus Armbruster e43bfd9c87 error: Use error_prepend() where it makes obvious sense
Done with this Coccinelle semantic patch

    @@
    expression FMT, E1, E2;
    expression list ARGS;
    @@
    -    error_setg(E1, FMT, ARGS, error_get_pretty(E2));
    +    error_propagate(E1, E2);/*###*/
    +    error_prepend(E1, FMT/*@@@*/, ARGS);

followed by manual cleanup, first because I can't figure out how to
make Coccinelle transform strings, and second to get rid of now
superfluous error_propagate().

We now use or propagate the original error whole instead of just its
message obtained with error_get_pretty().  This avoids suppressing its
hint (see commit 50b7b00), but I can't see how the errors touched in
this commit could come with hints.  It also improves the message
printed with &error_abort when we screw up (see commit 1e9b65b).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-13 15:16:17 +01:00
Denis V. Lunev b1fc8f934b qcow2: insert assert into qcow2_get_specific_info()
s->qcow_version is always set to 2 or 3. Let's assert if this is wrong.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz 61ce55fc02 qcow2: Invoke refcount order amendment function
Make use of qcow2_change_refcount_order() to support changing the
refcount order with qemu-img amend.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz c293a80927 qcow2: Use intermediate helper CB for amend
If there is more than one time-consuming operation to be performed for
qcow2_amend_options(), we need an intermediate CB which coordinates the
progress of the individual operations and passes the result to the
original status callback.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz 1038bbb803 qcow2: Split upgrade/downgrade paths for amend
If the image version should be upgraded, that is the first we should do;
if it should be downgraded, that is the last we should do. So split the
version change block into an upgrade part at the start and a downgrade
part at the end.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz 164e0f89cc qcow2: Use abort() instead of assert(false)
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz 29d72431ef qcow2: Use error_report() in qcow2_amend_options()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz 8b13976d3f block: Add opaque value to the amend CB
Add an opaque value which is to be passed to the bdrv_amend_options()
status callback.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Kevin Wolf 5365f44dfa qcow2: Add .bdrv_join_options callback
qcow2 accepts a few driver-specific options that overlap semantically
(e.g. "overlap-check" is an alias of "overlap-check.template", and any
missing cache size option is derived from the given ones).

When bdrv_reopen() merges the set of updated options with left out
options that should be kept at their old value, we need to consider this
and filter out any duplicates (which would generally cause errors
because new and old value would contradict each other).

This patch adds a .bdrv_join_options callback to BlockDriver and
implements it for qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 14:34:42 +01:00
Eric Blake 7fb1cf1606 qapi: Don't let implicit enum MAX member collide
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes.  Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.

This patch was mostly generated by applying a temporary patch:

|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
|     max_index = c_enum_const(name, 'MAX', prefix)
|     ret += mcgen('''
|     [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
|                max_index=max_index)

then running:

$ cat qapi-{types,event}.c tests/test-qapi-types.c |
    sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list

The only things not generated are the changes in scripts/qapi.py.

Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-12-17 08:21:28 +01:00
Eric Blake 6a8f9661dc block: Convert to new qapi union layout
We have two issues with our qapi union layout:
1) Even though the QMP wire format spells the tag 'type', the
C code spells it 'kind', requiring some hacks in the generator.
2) The C struct uses an anonymous union, which places all tag
values in the same namespace as all non-variant members. This
leads to spurious collisions if a tag value matches a non-variant
member's name.

Make the conversion to the new layout for block-related code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1445898903-12082-16-git-send-email-eblake@redhat.com>
[Commit message tweaked slightly]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-11-02 08:30:27 +01:00
Kevin Wolf 760e006384 block: Convert bs->backing_hd to BdrvChild
This is the final step in converting all of the BlockDriverState
pointers that block drivers use to BdrvChild.

After this patch, bs->children contains the full list of child nodes
that are referenced by a given BDS, and these children are only
referenced through BdrvChild, so that updating the pointer in there is
enough for changing edges in the graph.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 9a4f4c3156 block: Convert bs->file to BdrvChild
This patch removes the temporary duplication between bs->file and
bs->file_child by converting everything to BdrvChild.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-16 15:34:29 +02:00
Kevin Wolf 5b0959a7d4 qcow2: Support updating driver-specific options in reopen
For updating the cache sizes, disabling lazy refcounts and updating the
clean_cache_timer there is a bit more to do than just changing the
variables, but otherwise we're all set for changing options during
bdrv_reopen().

Just implement the missing pieces and hook the functions up in
bdrv_reopen().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:37 +02:00
Kevin Wolf ee55b17304 qcow2: Make qcow2_update_options() suitable for transactions
Before we can allow updating options at runtime with bdrv_reopen(), we
need to split the function into prepare/commit/abort parts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:37 +02:00
Kevin Wolf c1344ded70 qcow2: Fix memory leak in qcow2_update_options() error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:36 +02:00
Kevin Wolf 007dbc396c qcow2: Leave s unchanged on qcow2_update_options() failure
On return, either all new options should be applied to BDRVQcowState (on
success), or all of the old settings should be preserved (on failure).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:36 +02:00
Kevin Wolf 94edf3fbe8 qcow2: Move rest of option handling to qcow2_update_options()
With this commit, the handling of driver-specific options in
qcow2_open() is completely separated out into qcow2_update_options().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:36 +02:00
Kevin Wolf 90efa0eaef qcow2: Move qcow2_update_options() call up
qcow2_update_options() only updates some variables in BDRVQcowState and
doesn't really depend on other parts of it being initialised yet, so it
can be moved so that it immediately follows the other half of option
handling code in qcow2_open().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:36 +02:00
Kevin Wolf 4c75d1a157 qcow2: Factor out qcow2_update_options()
Eventually we want to be able to change options at runtime. As a first
step towards that goal, separate some option handling code from the
general initialisation code in qcow2_open().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:36 +02:00
Kevin Wolf f113ae839e qcow2: Improve error message
Eric says that "any" sounds better than "either", and my non-native
feeling says the same, so let's change it.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-09-14 16:51:36 +02:00
Kevin Wolf ff99129ab8 qcow2: Rename BDRVQcowState to BDRVQcow2State
BDRVQcowState is already used by qcow1, and gdb is always confused which
one to use. Rename the qcow2 one so they can be distinguished.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-09-14 16:51:36 +02:00
Max Reitz 6ebf9aa2ef block: Drop drv parameter from bdrv_open()
Now that this parameter is effectively unused, we can drop it and just
pass NULL on to bdrv_open_inherit().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14 16:51:36 +02:00
Max Reitz e6641719fe block: Always pass NULL as drv for bdrv_open()
Change all callers of bdrv_open() to pass the driver name in the options
QDict instead of passing its BlockDriver pointer.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-09-14 16:51:36 +02:00
Alberto Garcia 279621c046 qcow2: add option to clean unused cache entries after some time
This adds a new 'cache-clean-interval' option that cleans all qcow2
cache entries that haven't been used in a certain interval, given in
seconds.

This allows setting a large L2 cache size so it can handle scenarios
with lots of I/O and at the same time use little memory during periods
of inactivity.

This feature currently relies on MADV_DONTNEED to free that memory, so
it is not useful in systems that don't follow that behavior.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: a70d12da60433df9360ada648b3f34b8f6f354ce.1438690126.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2015-09-04 21:00:32 +02:00
Daniel P. Berrange f6fa64f6d2 block: convert qcow/qcow2 to use generic cipher API
Switch the qcow/qcow2 block driver over to use the generic cipher
API, this allows it to use the pluggable AES implementations,
instead of being hardcoded to use QEMU's built-in impl.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-10-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-08 13:11:01 +02:00
Daniel P. Berrange 6f2945cde6 crypto: move built-in AES implementation into crypto/
To prepare for a generic internal cipher API, move the
built-in AES implementation into the crypto/ directory

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1435770638-25715-3-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-07 12:04:13 +02:00
Markus Armbruster c6bd8c706a qerror: Clean up QERR_ macros to expand into a single string
These macros expand into error class enumeration constant, comma,
string.  Unclean.  Has been that way since commit 13f59ae.

The error class is always ERROR_CLASS_GENERIC_ERROR since the previous
commit.

Clean up as follows:

* Prepend every use of a QERR_ macro by ERROR_CLASS_GENERIC_ERROR, and
  delete it from the QERR_ macro.  No change after preprocessing.

* Rewrite error_set(ERROR_CLASS_GENERIC_ERROR, ...) into
  error_setg(...).  Again, no change after preprocessing.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
2015-06-22 18:20:40 +02:00
Max Reitz bc85ef265a qcow2: Add DEFAULT_L2_CACHE_CLUSTERS
If a relatively large cluster size is chosen, the default of 1 MB L2
cache is not really appropriate. In this case, unless overridden by the
user, the default cache size should not be determined by its size in
bytes but by the number of L2 tables (clusters) it is supposed to
contain.

Note that without this patch, MIN_L2_CACHE_SIZE will effectively take
over the same role. However, providing space for just two L2 tables is
not enough to be the default.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-06-12 15:54:01 +02:00
Daniel P. Berrange 8336aafae1 qcow2/qcow: protect against uninitialized encryption key
When a qcow[2] file is opened, if the header reports an
encryption method, this is used to set the 'crypt_method_header'
field on the BDRVQcow[2]State struct, and the 'encrypted' flag
in the BDRVState struct.

When doing I/O operations, the 'crypt_method' field on the
BDRVQcow[2]State struct is checked to determine if encryption
needs to be applied.

The crypt_method_header value is copied into crypt_method when
the bdrv_set_key() method is called.

The QEMU code which opens a block device is expected to always
do a check

   if (bdrv_is_encrypted(bs)) {
       bdrv_set_key(bs, ....key...);
   }

If code forgets to do this, then 'crypt_method' is never set
and so when I/O is performed, QEMU writes plain text data
into a sector which is expected to contain cipher text, or
when reading, will return cipher text instead of plain
text.

Change the qcow[2] code to consult bs->encrypted when deciding
whether encryption is required, and assert(s->crypt_method)
to protect against cases where the caller forgets to set the
encryption key.

Also put an assert in the set_key methods to protect against
the case where the caller sets an encryption key on a block
device that does not have encryption

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-05-22 17:08:01 +02:00
Alberto Garcia dc881b441d block: add 'node-name' field to BLOCK_IMAGE_CORRUPTED
Since this event can occur in nodes that cannot have a device name
associated, include also a field with the node name.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 147cec5b3594f4bec0cb41c98afe5fcbfb67567c.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Alberto Garcia 81e5f78a9f block: use bdrv_get_device_or_node_name() in error messages
There are several error messages that identify a BlockDriverState by
its device name. However those errors can be produced in nodes that
don't have a device name associated.

In those cases we should use bdrv_get_device_or_node_name() to fall
back to the node name and produce a more meaningful message. The
messages are also updated to use the more generic term 'node' instead
of 'device'.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Stefan Hajnoczi 786a4ea82e Convert (ffs(val) - 1) to ctz32(val)
This commit was generated mechanically by coccinelle from the following
semantic patch:

@@
expression val;
@@
- (ffs(val) - 1)
+ ctz32(val)

The call sites have been audited to ensure the ffs(0) - 1 == -1 case
never occurs (due to input validation, asserts, etc).  Therefore we
don't need to worry about the fact that ctz32(0) == 32.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Kevin Wolf e4603fe139 qcow2: Fix header update with overridden backing file
In recent qemu versions, it is possible to override the backing file
name and format that is stored in the image file with values given at
runtime. In such cases, the temporary override could end up in the
image header if the qcow2 header was updated, while obviously correct
behaviour would be to leave the on-disk backing file path/format
unchanged.

Fix this and add a test case for it.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1428411796-2852-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-04-08 10:29:20 +01:00
Wen Congyang 87b86e7ef2 qcow2: fix the macro QCOW_MAX_L1_SIZE's use
QCOW_MAX_L1_SIZE's unit is byte, and l1_size's unit
is l1 table entry size(8 bytes).

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Message-id: 54FFB0F1.5010307@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-03-12 17:41:23 +00:00
Max Reitz 06d05fa738 qcow2: Allow creation with refcount order != 4
Add a creation option to qcow2 for setting the refcount order of images
to be created, and respect that option's value.

This breaks some test outputs, fix them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10 14:02:21 +01:00
Max Reitz 8a17b83cc3 qcow2: Use symbolic macros in qcow2_amend_options
qcow2_amend_options() should not compare options against some inline
strings but rather use the symbolic macros available for each of the
creation options.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10 14:02:21 +01:00
Max Reitz bd4b167f84 qcow2: refcount_order parameter for qcow2_create2
Add a refcount_order parameter to qcow2_create2(), use that value for
the image header and for calculating the size required for
preallocation.

For now, always pass 4.

This addition requires changes to the calculation of the file size for
the "full" and "falloc" preallocation modes. That in turn is a nice
opportunity to add a comment about that calculation not necessarily
being exact (and that being intentional).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10 14:02:21 +01:00
Max Reitz b72faf9f78 qcow2: Open images with refcount order != 4
No longer refuse to open images with a different refcount entry width
than 16 bits; only reject images with a refcount width larger than 64
bits (which is prohibited by the specification).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10 14:02:21 +01:00
Max Reitz 0709c5a153 qcow2: Add refcount_bits to format-specific info
Add the bit width of every refcount entry to the format-specific
information.

In contrast to lazy_refcounts and the corrupt flag, this should be
always emitted, even for compat=0.10 although it does not support any
refcount width other than 16 bits. This is because if a boolean is
optional, one normally assumes it to be false when omitted; but if an
integer is not specified, it is rather difficult to guess its value.

This new field breaks some test outputs, fix them.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10 14:02:20 +01:00
Max Reitz 346a53df38 qcow2: Add two new fields to BDRVQcowState
Add two new fields regarding refcount information (the bit width of
every entry and the maximum refcount value) to the BDRVQcowState.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-03-10 14:02:20 +01:00
Markus Armbruster f43e47dbf6 QemuOpts: Drop qemu_opt_set(), rename qemu_opt_set_err(), fix use
qemu_opt_set() is a wrapper around qemu_opt_set() that reports the
error with qerror_report_err().

Most of its users assume the function can't fail.  Make them use
qemu_opt_set_err() with &error_abort, so that should the assumption
ever break, it'll break noisily.

Just two users remain, in util/qemu-config.c.  Switch them to
qemu_opt_set_err() as well, then rename qemu_opt_set_err() to
qemu_opt_set().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2015-02-26 14:49:31 +01:00
Markus Armbruster 39101f2511 QemuOpts: Convert qemu_opt_set_number() to Error, fix its use
Return the Error object instead of reporting it with
qerror_report_err().

Change callers that assume the function can't fail to pass
&error_abort, so that should the assumption ever break, it'll break
noisily.

Turns out all callers outside its unit test assume that.  We could
drop the Error ** argument, but that would make the interface less
regular, so don't.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2015-02-26 14:47:32 +01:00
Max Reitz c0191e763b block: Remove "growable" from BDS
Now that request clamping is done in the BlockBackend, the "growable"
field can be removed from the BlockDriverState. All BDSs are now treated
as being "growable" (that is, they are allowed to grow; they are not
necessarily actually able to).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1423162705-32065-16-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-02-16 15:07:19 +00:00
Jeff Cody e729fa6afe block: fix off-by-one error in qcow and qcow2
This fixes an off-by-one error introduced in 9a29e18.  Both qcow and
qcow2 need to make sure to leave room for string terminator '\0' for
the backing file, so the max length of the non-terminated string is
either 1023 or PATH_MAX - 1.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-02-06 17:24:21 +01:00
Jeff Cody 9a29e18f7d block: update string sizes for filename,backing_file,exact_filename
The string field entries 'filename', 'backing_file', and
'exact_filename' in the BlockDriverState struct are defined as 1024
bytes.

However, many places that use these values accept a maximum of PATH_MAX
bytes, so we have a mixture of 1024 byte and PATH_MAX byte allocations.
This patch makes the BlockDriverStruct field string sizes match usage.

This patch also does a few fixes related to the size that needs to
happen now:

    * the block qapi driver is updated to use PATH_MAX bytes
    * the qcow and qcow2 drivers have an additional safety check
    * the block vvfat driver is updated to use PATH_MAX bytes
      for the size of backing_file, for systems where PATH_MAX is < 1024
      bytes.
    * qemu-img uses PATH_MAX rather than 1024.  These instances were not
      changed to be dynamically allocated, however, as the extra
      temporary 3K in stack usage for qemu-img does not seem worrisome.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-01-23 18:17:06 +01:00
Max Reitz 6a69b9620a qcow2: Respect bdrv_truncate() error
bdrv_truncate() may fail and qcow2_write_compressed() should return the
error code in that case.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10 10:31:20 +01:00
Max Reitz 3b5e14c76a qcow2: Flushing the caches in qcow2_close may fail
qcow2_cache_flush() may fail; if one of the caches failed to be flushed
successfully to disk in qcow2_close() the image should not be marked
clean, and we should emit a warning.

This breaks the (qcow2-specific) iotests 026, 071 and 089; change their
output accordingly.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10 10:31:20 +01:00
Max Reitz ef8104378c block: Omit bdrv_find_format for essential drivers
We can always assume raw, file and qcow2 being available; so do not use
bdrv_find_format() to locate their BlockDriver objects but statically
reference the respective objects.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10 10:31:19 +01:00
Max Reitz 5f535a941e block: Make essential BlockDriver objects public
There are some block drivers which are essential to QEMU and may not be
removed: These are raw, file and qcow2 (as the default non-raw format).
Make their BlockDriver objects public so they can be directly referenced
throughout the block layer without needing to call bdrv_find_format()
and having to deal with an error at runtime, while the real problem
occurred during linking (where raw, file or qcow2 were not linked into
qemu).

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10 10:31:19 +01:00
Kevin Wolf 2ebafc854d qcow2: Fix header extension size check
After reading the extension header, offset is incremented, but not
checked against end_offset any more. This way an integer overflow could
happen when checking whether the extension end is within the allowed
range, effectively disabling the check.

This patch adds the missing check and a test case for it.

Cc: qemu-stable@nongnu.org
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-2-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-12-10 10:31:13 +01:00
Max Reitz 4057a2b24a block/qcow2: Implement status CB for amend
The only really time-consuming operation potentially performed by
qcow2_amend_options() is zero cluster expansion when downgrading qcow2
images from compat=1.1 to compat=0.10, so report status of that
operation and that operation only through the status CB.

For this, approximate the progress as the number of L1 entries visited
during the operation.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1414404776-4919-5-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:49 +00:00
Max Reitz 7748543420 block: Add status callback to bdrv_amend_options()
Depending on the changed options and the image format,
bdrv_amend_options() may take a significant amount of time. In these
cases, a way to be informed about the operation's status is desirable.

Since the operation is rather complex and may fundamentally change the
image, implementing it as AIO or a coroutine does not seem feasible. On
the other hand, implementing it as a block job would be significantly
more difficult than a simple callback and would not add benefits other
than progress report to the amending operation, because it should not
actually be run as a block job at all.

A callback may not be very pretty, but it's very easy to implement and
perfectly fits its purpose here.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414404776-4919-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz 94054183da qcow2: Optimize bdrv_make_empty()
bdrv_make_empty() is currently only called if the current image
represents an external snapshot that has been committed to its base
image; it is therefore unlikely to have internal snapshots. In this
case, bdrv_make_empty() can be greatly sped up by emptying the L1 and
refcount table (while having the dirty flag set, which only works for
compat=1.1) and creating a trivial refcount structure.

If there are snapshots or for compat=0.10, fall back to the simple
implementation (discard all clusters).

[Applied s/clusters/cluster/ typo fix suggested by Eric Blake
--Stefan]

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1414159063-25977-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz 491d27e2af qcow2: Implement bdrv_make_empty()
Implement this function by making all clusters in the image file fall
through to the backing file (by using the recently extended discard).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:48 +00:00
Max Reitz 808c4b6f30 qcow2: Allow "full" discard
Normally, discarded sectors should read back as zero. However, there are
cases in which a sector (or rather cluster) should be discarded as if
they were never written in the first place, that is, reading them should
fall through to the backing file again.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1414159063-25977-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-11-03 11:41:47 +00:00
Max Reitz 17bd5f4727 qcow2: Drop REFCOUNT_SHIFT
With BDRVQcowState.refcount_block_bits, we don't need REFCOUNT_SHIFT
anymore.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:02 +02:00
Max Reitz 5b84106bd9 qcow2: Fix leaks in dirty images
When opening dirty images, qcow2's repair function should not only
repair errors but leaks as well.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Max Reitz 1d13d65466 qcow2: Calculate refcount block entry count
The size of a refblock entry is (in theory) variable; calculate
therefore the number of entries per refblock and the according bit shift
(1 << x == entry count) when opening an image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-23 15:34:01 +02:00
Markus Armbruster bfb197e0d9 block: Eliminate BlockDriverState member device_name[]
device_name[] can become non-empty only in bdrv_new_root() and
bdrv_move_feature_fields().  The latter is used only to undo damage
done by bdrv_swap().  The former is called only by blk_new_with_bs().
Therefore, when a BlockDriverState's device_name[] is non-empty, then
it's been created with a BlockBackend, and vice versa.  Furthermore,
blk_new_with_bs() keeps the two names equal.

Therefore, device_name[] is redundant.  Eliminate it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:26 +02:00
Max Reitz 9009b1963c qapi: Add corrupt field to ImageInfoSpecificQCow2
Just like lazy-refcounts, this field will be present iff the qcow2
compat level is 1.1 (or probably any future revision).

As expected, this breaks some tests due to the new field present in
qemu-img info output; so fix their output accordingly.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1412105489-7681-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-10-04 19:18:17 +01:00
Max Reitz ee42b5ce0b qcow2: Add overlap-check.template option
Being able to set the overlap-check option to a string and then refine
it via the overlap-check.* options is a nice idea for the command line
but does not work so well for non-flattened dicts. In that case, one can
only specify either but not both, so add a field to overlap-check.*
which does the same as directly specifying overlap-check but can be used
in conjunction with the other fields in non-flattened dicts.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1408557576-14574-4-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:34 +01:00
Max Reitz 7b17ce60cc qcow2: Fix leak of QemuOpts in qcow2_open()
Currently, the QemuOpts object opts is leaked if anything fails from its
creation up to and including the image repair block. Fix this by freeing
that object in the fail path.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1408557576-14574-2-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:32 +01:00
Max Reitz 85186ebdac qcow2: Add qcow2_signal_corruption()
Add a helper function for easily marking an image corrupt (on fatal
corruptions) while outputting an informative message to stderr and via
QAPI.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Message-id: 1409926039-29044-3-git-send-email-mreitz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-09-22 11:39:27 +01:00
Hu Tao 0e4271b711 qcow2: Add falloc and full preallocation option
preallocation=falloc allocates disk space by posix_fallocate(),
preallocation=full allocates disk space by writing zeros to disk.
Both modes imply preallocation=metadata.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao ffeaac9b4e qapi: introduce PreallocMode and new PreallocModes full and falloc.
This patch prepares for the subsequent patches.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao 180e95265e block: don't convert file size to sector size
and avoid converting it back later.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Hu Tao c2eb918e32 block: round up file size to nearest sector
Currently the file size requested by user is rounded down to nearest
sector, causing the actual file size could be a bit less than the size
user requested. Since some formats (like qcow2) record virtual disk
size in bytes, this can make the last few bytes cannot be accessed.

This patch fixes it by rounding up file size to nearest sector so that
the actual file size is no less than the requested file size.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-09-12 15:43:06 +02:00
Max Reitz 6c1c8d5d3e qcow2: Add runtime options for cache sizes
Add options for specifying the size of the metadata caches. This can
either be done directly for each cache (if only one is given, the other
will be derived according to a default ratio) or combined for both.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Max Reitz 440ba08aea qcow2: Constant cache size in bytes
Specifying the metadata cache sizes in clusters results in less clusters
(and much less bytes) covered for small cluster sizes and vice versa.
Using a constant byte size reduces this difference, and makes it
possible to manually specify the cache size in an easily comprehensible
unit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-08-20 11:51:28 +02:00
Kevin Wolf de82815db1 qcow2: Handle failure for potentially large allocations
Some code in the block layer makes potentially huge allocations. Failure
is not completely unexpected there, so avoid aborting qemu and handle
out-of-memory situations gracefully.

This patch addresses the allocations in the qcow2 block driver.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:15 +02:00
Markus Armbruster 75d3d21f9e block: Drop superfluous aligning of bdrv_getlength()'s value
It returns a multiple of the sector size.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Markus Armbruster 57322b7811 block: Use bdrv_nb_sectors() where sectors, not bytes are wanted
Instead of bdrv_getlength().

Aside: a few of these callers don't handle errors.  I didn't
investigate whether they should.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-08-15 15:07:13 +02:00
Kevin Wolf 3baca89139 block: Add Error argument to bdrv_refresh_limits()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:18:43 +01:00
Kevin Wolf 12ac6d3db7 qcow2: Fix error path for unknown incompatible features
qcow2's report_unsupported_feature() had two bugs: A 32 bit truncation
would prevent feature table entries for bits 32-63 from being used, and
it could assign errp multiple times if there was more than one unknown
feature, resulting in an error_set() assertion failure.

Fix the truncation, make sure to set the error exactly once and add a
qemu-iotests case for it.

This fixes https://bugs.launchpad.net/qemu/+bug/1342704/

Reported-by: Maria Kustova <maria.k@catit.be>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-07-18 13:12:15 +01:00
Kevin Wolf 44deba5a52 qcow2: Make qiov match request size until backing file EOF
If a qcow2 image has a shorter backing file and a read request to
unallocated clusters goes across EOF of the backing file, the backing
file sees a shortened request and the rest is filled with zeros.
However, the original too long qiov was used with the shortened request.

This patch makes the qiov size match the request size, avoiding a
potential buffer overflow in raw-posix.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2014-07-14 12:03:20 +02:00
Kevin Wolf 8ee79e707a block: Catch backing files assigned to non-COW drivers
Since we parse backing.* options to add a backing file from the command
line when the driver didn't assign one, it has been possible to have a
backing file for e.g. raw images (it just was never accessed).

This is obvious nonsense and should be rejected.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2014-06-26 13:51:01 +02:00
Chunyan Liu c282e1fdf7 cleanup QEMUOptionParameter
Now that all backend drivers are using QemuOpts, remove all
QEMUOptionParameter related codes.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 1bd0e2d1c4 qcow2.c: replace QEMUOptionParameter with QemuOpts
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:21 +08:00
Chunyan Liu 83d0521a1e change block layer to support both QemuOpts and QEMUOptionParamter
Change block layer to support both QemuOpts and QEMUOptionParameter.
After this patch, it will change backend drivers one by one. At the end,
QEMUOptionParameter will be removed and only QemuOpts is kept.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-16 17:23:20 +08:00
Markus Armbruster a1904e48c4 qcow2: Plug memory leak on qcow2_invalidate_cache() error paths
Introduced in commit 5a8a30d.  Spotted by Coverity.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-05-30 14:26:54 +02:00
Max Reitz 521b2b5df0 block: Use correct width in format strings
Instead of blindly relying on a normal integer having a width of 32 bits
(which is a pretty good assumption, but we should not rely on it if
there is no need), use the correct format string macros.

This does not touch DEBUG output.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-30 14:46:17 +02:00
Kevin Wolf 4c2e5f8f46 qcow2: Flush metadata during read-only reopen
If lazy refcounts are enabled for a backing file, committing to this
backing file may leave it in a dirty state even if the commit succeeds.
The reason is that the bdrv_flush() call in bdrv_commit() doesn't flush
refcount updates with lazy refcounts enabled, and qcow2_reopen_prepare()
doesn't take care to flush metadata.

In order to fix this, this patch also fixes qcow2_mark_clean(), which
contains another ineffective bdrv_flush() call beause lazy refcounts are
disabled only afterwards. All existing callers of qcow2_mark_clean()
either don't modify refcounts or already flush manually, so that this
fixes only a latent, but not yet actually triggerable bug.

Another instance of the same problem is live snapshots. Again, a real
corruption is prevented by an explicit flush for non-read-only images in
external_snapshot_prepare(), but images using lazy refcounts stay dirty.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-04 14:12:26 +02:00
Stefan Hajnoczi c792707f54 qcow2: link all L2 meta updates in preallocate()
preallocate() only links the first QCowL2Meta's data clusters into the
L2 table and ignores any chained QCowL2Metas in the linked list.

Chains of QCowL2Meta structs are built up when contiguous clusters span
L2 tables.  Each QCowL2Meta describes one L2 table update.  This is a
rare case in preallocate() but can happen.

This patch fixes preallocate() by iterating over the whole list of
QCowL2Metas.  Compare with the qcow2_co_writev() function's
implementation, which is similar but also also handles request
dependencies.  preallocate() only performs one allocation at a time so
there can be no dependencies.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 6a83f8b5be qcow2: Check maximum L1 size in qcow2_snapshot_load_tmp() (CVE-2014-0143)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 11b128f406 qcow2: Fix NULL dereference in qcow2_open() error path (CVE-2014-0146)
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:35 +02:00
Kevin Wolf 2b5d5953ee qcow2: Check new refcount table size on growth
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:22:34 +02:00
Kevin Wolf b106ad9185 qcow2: Don't rely on free_cluster_index in alloc_refcount_block() (CVE-2014-0147)
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.

So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.

The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)

[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 15:21:03 +02:00
Kevin Wolf 6d33e8e7dc qcow2: Fix backing file name length check
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.

This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 2d51c32c4b qcow2: Validate active L1 table offset and size (CVE-2014-0144)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf ce48f2f441 qcow2: Validate snapshot table offset/size (CVE-2014-0144)
This avoid unbounded memory allocation and fixes a potential buffer
overflow on 32 bit hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 8c7de28305 qcow2: Validate refcount table offset
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.

Also check for misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 5dab2faddc qcow2: Check refcount table size (CVE-2014-0144)
Limit the in-memory reference count table size to 8 MB, it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf a1b3955c94 qcow2: Check backing_file_offset (CVE-2014-0144)
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Kevin Wolf 24342f2cae qcow2: Check header_length (CVE-2014-0144)
This fixes an unbounded allocation for s->unknown_header_fields.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 14:19:09 +02:00
Prasad Joshi c5a33ee9ee qcow2: fix two memory leaks in qcow2_open error code path
Signed-off-by: Prasad Joshi <prasadjoshi.linux@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-04-01 13:49:53 +02:00
Kevin Wolf 5a8a30db47 block: Add error handling to bdrv_invalidate_cache()
If it returns an error, the migrated VM will not be started, but qemu
exits with an error message.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-03-19 09:39:41 +01:00
Kevin Wolf 27eb6c097c qcow2: Don't write with BDRV_O_INCOMING
qcow2_open() causes writes when repairing an image with the dirty flag
set and when clearing autoclear flags. It shouldn't do this when another
qemu instance is still actively working on this image file.

One effect of the bug is that images may have a cleared dirty flag while
the migration source host still has it in use with lazy refcounts
enabled, so refcounts are not accurate and the dirty flag must remain
set.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Kevin Wolf d475e5acd2 qcow2: Keep option in qcow2_invalidate_cache()
Instead of manually building a list of all options from BDRVQcowState
values just reuse the options that were used to open the image.
qcow2_open() won't fully use all of the options in the QDict, but that's
okay.

This fixes all of the driver-specific options in qcow2, except for
lazy-refcounts, which was special cased before.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:42:24 +01:00
Kevin Wolf 3456a8d185 block: Update image size in bdrv_invalidate_cache()
After migration has completed, we call bdrv_invalidate_cache() so that
drivers which cache some data drop their stale copy of the data and
reread it from the image file to get a new version of data that the
source modified while the migration was running.

Reloading metadata from the image file is useless, though, if the size
of the image file stays stale (this is a value that is cached for all
image formats in block.c). Reads from (meta)data after the old EOF
return only zeroes, causing image corruption.

We need to update bs->total_sectors in all layers that could potentially
have changed their size (i.e. backing files are not a concern - if they
are changed, we're in bigger trouble)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-03-13 14:23:27 +01:00
Paolo Bonzini 76abe4071d block: do not abuse EMEDIUMTYPE
Returning "Wrong medium type" for an image that does not have a valid
header is a bit weird.  Improve the error by mentioning what format
was trying to open it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:24 +01:00
Max Reitz 2e40134bfd block: Make bdrv_file_open() static
Add the bdrv_open() option BDRV_O_PROTOCOL which results in passing the
call to bdrv_file_open(). Additionally, make bdrv_file_open() static and
therefore bdrv_open() the only way to call it.

Consequently, all existing calls to bdrv_file_open() have to be adjusted
to use bdrv_open() with the BDRV_O_PROTOCOL flag instead.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz ddf5636dc9 block: Add reference parameter to bdrv_open()
Allow bdrv_open() to handle references to existing block devices just as
bdrv_file_open() is already capable of.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:22 +01:00
Max Reitz f67503e5bd block: Change BDS parameter of bdrv_open() to **
Make bdrv_open() take a pointer to a BDS pointer, similarly to
bdrv_file_open(). If a pointer to a NULL pointer is given, bdrv_open()
will create a new BDS with an empty name; if the BDS pointer is not
NULL, that existing BDS will be reused (in the same way as bdrv_open()
already did).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-21 21:02:21 +01:00
Markus Armbruster 84d18f065f Use error_is_set() only when necessary
error_is_set(&var) is the same as var != NULL, but it takes
whole-program analysis to figure that out.  Unnecessarily hard for
optimizers, static checkers, and human readers.  Dumb it down to
obvious.

Gets rid of several dozen Coverity false positives.

Note that the obvious form is already used in many places.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-02-17 11:57:23 -05:00
Hu Tao 7c2bbf4aa6 qcow2: check for NULL l2meta
In the case of a metadata preallocation with a large cluster size,
qcow2_alloc_cluster_offset() can allocate nothing and returns a
NULL l2meta. This patch checks for it and link2 l2 with only valid
l2meta.

Replace 9 and 512 with BDRV_SECTOR_BITS, BDRV_SECTOR_SIZE
respectively while at the function.

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Hu Tao 16f0587e0a qcow2: remove n_start and n_end of qcow2_alloc_cluster_offset()
n_start can be actually calculated from offset. The number of
sectors to be allocated(n_end - n_start) can be passed in in
num. By removing n_start and n_end, we can save two parameters.

The side effect is there is a bug in qcow2.c:preallocate() that
passes incorrect n_start to qcow2_alloc_cluster_offset() is
fixed. The bug can be triggerred by a larger cluster size than
the default value(65536), for example:

./qemu-img create -f qcow2 \
  -o 'cluster_size=131072,preallocation=metadata' file.img 4G

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-02-09 09:12:39 +01:00
Jeff Cody 14b4a8b9c6 block: remove qcow2 .bdrv_make_empty implementation
The QCOW2 .bdrv_make_empty implementation always returns 0 for success,
but does not actually do anything.

The proper way to not support an optional driver function stub is to
just not implement it, so let's remove the stub.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-01-31 22:05:03 +01:00
Kevin Wolf d34682cd4a block: Move initialisation of BlockLimits to bdrv_refresh_limits()
This function separates filling the BlockLimits from bdrv_open(), which
allows it to call it from other operations which may change the limits
(e.g. modifications to the backing file chain or bdrv_reopen)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2014-01-24 17:40:01 +01:00
Max Reitz 72daa72eee block: Allow reference for bdrv_file_open()
Allow specifying a reference to an existing block device (by name) for
bdrv_file_open() instead of a filename and/or options.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-01-22 12:07:17 +01:00
Peter Crosthwaite 87ea75d5e1 qemu-option: Remove qemu_opts_create_nofail
This is a boiler-plate _nofail variant of qemu_opts_create. Remove and
use error_abort in call sites.

null/0 arguments needs to be added for the id and fail_if_exists fields
in affected callsites due to argument inconsistency between the normal and
no_fail variants.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2014-01-06 15:02:30 -05:00
Kevin Wolf f8413b3c23 qcow2: Zero-initialise first cluster for new images
Strictly speaking, this is only required for has_zero_init() == false,
but it's easy enough to just do a cluster-aligned write that is padded
with zeros after the header.

This fixes that after 'qemu-img create' header extensions are attempted
to be parsed that are really just random leftover data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-04 11:29:37 +01:00
Paolo Bonzini cffb1ec600 block drivers: expose requirement for write same alignment from formats
This will let misaligned but large requests use zero clusters.  This
is important because the cluster size is not guest visible.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Paolo Bonzini 95de6d7078 block drivers: add discard/write_zeroes properties to bdrv_get_info implementation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-12-03 15:26:49 +01:00
Kevin Wolf c9fbb99d41 block: Use BDRV_O_NO_BACKING where appropriate
If you open an image temporarily just because you want to check its size
or get it flushed, there's no real reason to open the whole backing file
chain.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
2013-11-29 17:41:09 +01:00
Peter Lieven aa7bfbfff7 block: add flags to bdrv_*_write_zeroes
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Max Reitz ba2ab2f2ca qcow2: Flush image after creation
Opening the qcow2 image with BDRV_O_NO_FLUSH prevents any flushes during
the image creation. This means that the image has not yet been flushed
to disk when qemu-img create exits. This flush is delayed until the next
operation on the image involving opening it without BDRV_O_NO_FLUSH and
closing (or directly flushing) it. For large images and/or images with a
small cluster size and preallocated metadata, this flush may take a
significant amount of time and may occur unexpectedly.

Reopening the image without BDRV_O_NO_FLUSH right before the end of
qcow2_create2() results in hoisting the potentially costly flush into
the image creation, which is expected to take some time (whereas
successive image operations may be not).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-28 17:34:32 +01:00
Max Reitz 6e13610aa4 qcow2: Unset zero_beyond_eof in save_vmstate
Saving the VM state is done using bdrv_pwrite. This function may perform
a read-modify-write, which in this case results in data being read from
beyond the end of the virtual disk. Since we are actually trying to
access an area which is not a part of the virtual disk, zero_beyond_eof
has to be set to false before performing the partial write, otherwise
the VM state may become corrupted.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:50:51 +02:00
Max Reitz eedff66f21 qcow2: Restore total_sectors value in save_vmstate
Since df2a6f29a5, bdrv_co_do_writev increases the total_sectors value of
a growable block devices on writes after the current end. This leads to
the virtual disk apparently growing in qcow2_save_vmstate, which in turn
affects the disk size captured by the internal snapshot taken directly
afterwards through e.g. the HMP savevm command. Such a "grown" snapshot
cannot be loaded after reopening the qcow2 image, since its disk size
differs from the actual virtual disk size (writing a VM state does not
actually increase the virtual disk size).

Fix this by restoring total_sectors at the end of qcow2_save_vmstate.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-24 11:45:06 +02:00
Max Reitz 1fa5cc839a qcow2: Evaluate overlap check options
Evaluate the runtime overlap check options and set
BDRVQcowState.overlap_check appropriately.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 4092e99d93 qcow2: Array assigning options to OL check bits
Add an array which assigns the option string to its corresponding
overlap check bit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 05de7e86ca qcow2: Add overlap-check options
Add runtime options to tune the overlap checks to be performed before
write accesses.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 3e3553905c qcow2: Make overlap check mask variable
Replace the QCOW2_OL_DEFAULT macro by a variable overlap_check in
BDRVQcowState.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz 231bb26764 qcow2: Use negated overflow check mask
In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check,
change the parameter signifying the checks to perform from its current
positive form to a negative one, i.e., it will no longer explicitly
specify every check to perform but rather a mask of checks not to
perform.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:50:00 +02:00
Max Reitz f9bff97143 qcow2: Remove wrong metadata overlap check
In qcow2_write_compressed, if the compression fails, a normal cluster is
written to disk. This is done through bdrv_write on the qcow2 BDS
itself (using the guest offset), thus it is wrong to do a metadata
overlap check before.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz 9e3f08923a qcow2: Add missing space in error message
The error message in qcow2_downgrade about an unsupported refcount
order is missing a space. This patch adds it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 16:49:59 +02:00
Max Reitz 37764dfb71 qcow2: Add support for ImageInfoSpecific
Add a new ImageInfoSpecificQCow2 type as a subtype of ImageInfoSpecific.
This contains the compatibility level as a string and an optional
lazy_refcounts boolean (optional means mandatory for compat >= 1.1 and
not available for compat == 0.10).

Also, add qcow2_get_specific_info, which returns this information.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-10-11 14:03:57 +02:00
Jeff Cody c4217f645d block: qcow2 - used QEMU_PACKED for on-disk structures
QCowHeader and QCowExtension are structs that reside in the on-disk
image format, and are read and written directly via bdrv_pread()/write(),
and as such should be packed to avoid any unintentional struct padding.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-25 20:51:13 +02:00
Max Reitz 3ef6c40ad0 qcow2: Use Error parameter
Employ usage of the new Error ** parameter in qcow2_open, qcow2_create
and associated functions.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz cc84d90ff5 block: Error parameter for create functions
Add an Error ** parameter to bdrv_create and its associated functions to
allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz 34b5d2c68e block: Error parameter for open functions
Add an Error ** parameter to bdrv_open, bdrv_file_open and associated
functions to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz d5124c00d8 bdrv: Use "Error" for creating images
Add an Error ** parameter to BlockDriver.bdrv_create to allow more
specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:48 +02:00
Max Reitz 015a1036a7 bdrv: Use "Error" for opening images
Add an Error ** parameter to BlockDriver.bdrv_open and
BlockDriver.bdrv_file_open to allow more specific error messages.

Signed-off-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:47 +02:00
Max Reitz 9296b3ed70 qcow2: Implement bdrv_amend_options
Implement bdrv_amend_options for compat, size, backing_file, backing_fmt
and lazy_refcounts.

Downgrading images from compat=1.1 to compat=0.10 is achieved through
handling all incompatible flags accordingly, clearing all compatible and
autoclear flags and expanding all zero clusters.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Max Reitz b6481f376b qcow2: Save refcount order in BDRVQcowState
Save the image refcount order in BDRVQcowState. This will be relevant
for future code supporting different refcount orders than four and also
for code that needs to verify a certain refcount order for an opened
image.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf 1ebf561c11 qcow2: Discard VM state in active L1 after creating snapshot
During savevm, the VM state is written to the active L1 of the image and
then a snapshot is taken. After that, the VM state isn't needed any more
in the active L1 and should be discarded. This is implemented by this
patch.

The impact of not discarding the VM state is that a snapshot can never
become smaller than any previous snapshot (because it would be padded
with old VM state), and more importantly that future savevm operations
cause unnecessary COWs (with associated flushes), which makes subsequent
snapshots much slower.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Kevin Wolf 670df5e3b4 qcow2: Pass discard type to qcow2_discard_clusters()
The function will be used internally instead of only being called for
guest discard requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2013-09-12 10:12:46 +02:00
Paolo Bonzini 4bc74be997 block: return get_block_status data and flags for formats
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini b6b8a33354 block: introduce bdrv_get_block_status API
For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.

This also touches all block drivers with a mostly mechanical rename.  The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Paolo Bonzini d663640c04 block: expect errors from bdrv_co_is_allocated
Some bdrv_is_allocated callers do not expect errors, but the fallback
in qcow2.c might make other callers trip on assertion failures or
infinite loops.

Fix the callers to always look for errors.

Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:09 +02:00
Fam Zheng 4f6fd3491c block: make bdrv_delete() static
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.

This is an identical change because effectively, there's no multiple
reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets
bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Max Reitz 24530f3e06 qcow2_check: Mark image consistent
If no corruptions remain after an image repair (and no errors have been
encountered), clear the corrupt flag in qcow2_check.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-09-02 10:15:15 +02:00
Max Reitz cf93980e77 qcow2: Employ metadata overlap checks
The pre-write overlap check function is now called before most of the
qcow2 writes (aborting it on collision or other error).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Max Reitz 69c9872653 qcow2: Add corrupt bit
This adds an incompatible bit indicating corruption to qcow2. Any image
with this bit set may not be written to unless for repairing (and
subsequently clearing the bit if the repair has been successful).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-08-30 15:48:43 +02:00
Kevin Wolf 9117b47717 qcow2: Change default for new images to compat=1.1
By the time that qemu 1.7 will be released, enough time will have passed
since qemu 1.1, which is the first version to understand version 3
images, that changing the default shouldn't hurt many people any more
and the benefits of using the new format outweigh the pain.

qemu-iotests already runs with compat=1.1 by default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-08-30 15:28:51 +02:00
Asias He 0d51b4debe block: Introduce bs->zero_beyond_eof
In 4146b46c42e0989cb5842e04d88ab6ccb1713a48 (block: Produce zeros when
protocols reading beyond end of file), we break qemu-iotests ./check
-qcow2 022. This happens because qcow2 temporarily sets ->growable = 1
for vmstate accesses (which are stored beyond the end of regular image
data).

We introduce the bs->zero_beyond_eof to allow qcow2_load_vmstate() to
disable ->zero_beyond_eof temporarily in addition to enable ->growable.

[Since the broken patch "block: Produce zeros when protocols reading
beyond end of file" has not been merged yet, I have applied this fix
*first* and will then apply the next patch to keep the tree bisectable.
-- Stefan]

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Asias He <asias@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-22 14:10:21 +02:00
Kevin Wolf 8ad1898cf1 qcow2: Change default for new images to compat=1.1
By the time that qemu 1.7 will be released, enough time will have passed
since qemu 1.1, which is the first version to understand version 3
images, that changing the default shouldn't hurt many people any more
and the benefits of using the new format outweigh the pain.

qemu-iotests already runs with compat=1.1 by default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-08-21 14:41:09 +02:00
Kevin Wolf 64aa99d3e0 qcow2: Use dashes instead of underscores in options
This is what QMP wants to use. The options haven't been enabled in any
release yet, so we're still free to change them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-07-26 21:59:56 +02:00
Peter Lieven 3ac216270a block: change default of .has_zero_init to 0
.has_zero_init defaults to 1 for all formats and protocols.

this is a dangerous default since this means that all
new added drivers need to manually overwrite it to 0 if
they do not ensure that a device is zero initialized
after bdrv_create().

if a driver needs to explicitly set this value to
1 its easier to verify the correctness in the review process.

during review of the existing drivers it turned out
that ssh and gluster had a wrong default of 1.
both protocols support host_devices as backend
which are not by default zero initialized. this
wrong assumption will lead to possible corruption
if qemu-img convert is used to write to such a backend.

vpc and vmdk also defaulted to 1 altough they support
fixed respectively flat extends. this has to be addresses
in separate patches. both formats as well as the mentioned
ssh and gluster are turned to the default of 0 with this
patch for safety.

a similar problem with the wrong default existed for
iscsi most likely because the driver developer did
oversee the default value of 1.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-06-28 13:52:35 +02:00
Kevin Wolf 0b919fae31 qcow2: Batch discards
This optimises the discard operation for freed clusters by batching
discard requests (both snapshot deletion and bdrv_discard end up
updating the refcounts cluster by cluster).

Note that we don't discard asynchronously, but keep s->lock held. This
is to avoid that a freed cluster is reallocated and written to while the
discard is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 67af674e47 qcow2: Options to enable discard for freed clusters
Deleted snapshots are discarded in the image file by default, discard
requests take their default from the -drive discard=... option and other
places that free clusters must always be enabled explicitly.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 6cfcb9b8b9 qcow2: Add refcount update reason to all callers
This adds a refcount update reason to all callers of update_refcounts(),
so that a follow-up patch can use this information to decide whether
clusters that reach a refcount of 0 should be discarded in the image
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-06-24 10:25:17 +02:00
Kevin Wolf 2cf7cfa1cd qcow2: Catch some L1 table index overflows
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:

    $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
    Segmentation fault

With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:

    $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    qemu-img: The image size is too large for file format 'qcow2'

Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().

If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-05-14 16:44:33 +02:00
Stefan Hajnoczi f4d38bef7c qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Kevin Wolf 8d3b1a2d0b block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Kevin Wolf cf8074b382 block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Aurelien Jarno 753d9b82c5 aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused
by other subsystems.

Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-13 13:51:57 +02:00
Kevin Wolf 88c6588c51 qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows
to still have a single I/O request for the guest data, even though
multiple l2meta may be needed in order to describe both a COW overwrite
and a new cluster allocation (typical sequential write case).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf 9ee6439e27 qcow2: Remove bogus unlock of s->lock
The unlock wakes up the next coroutine, but the currently running
coroutine will lock it again before it yields, so this doesn't make a
lot of sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Kevin Wolf 787e4a8500 block: Add options QDict to bdrv_file_open() prototypes
The new parameter is unused yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:31 +01:00
Kevin Wolf acdfb480ba qcow2: Fix segfault in qcow2_invalidate_cache
Need to pass an options QDict to qcow2_open() now. This fixes a segfault
on the migration target with qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-19 11:48:36 +01:00
Paolo Bonzini 381b487d54 qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and
let the backing file show through.  This also matches what is done in qed.

QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files.  Check this
directly in qcow2_get_cluster_offset instead of replicating the test
everywhere.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:50 +01:00
Kevin Wolf 74c4510a3c qcow2: Allow lazy refcounts to be enabled on the command line
qcow2 images now accept a boolean lazy_refcounts options. Use it like
this:

  -drive file=test.qcow2,lazy_refcounts=on

If the option is specified on the command line, it overrides the default
specified by the qcow2 header flags that were set when creating the
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf de9c0cec6c block: Add options QDict to bdrv_open() prototype
It doesn't do anything yet except storing the options QDict in the
BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Kevin Wolf 1a86938f04 block: Add options QDict to .bdrv_open()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-15 16:07:49 +01:00
Stefan Weil 15bac0d54f block: Use error code EMEDIUMTYPE for wrong format in some block drivers
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk
when a file with the wrong format is selected.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:35 +01:00
Kevin Wolf 8d2497c355 qcow2: Fix segfault on zero-length write
One of the recent refactoring patches (commit f50f88b9) didn't take care
to initialise l2meta properly, so with zero-length writes, which don't
even enter the write loop, qemu just segfaulted.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-01-15 09:08:55 +01:00
Paolo Bonzini 1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini 737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Paolo Bonzini 7b1b5d1913 qapi: move include files to include/qobject/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Kevin Wolf 4e95314e2b qcow2: Execute run_dependent_requests() without lock
There's no reason for run_dependent_requests() to hold s->lock, and a
later patch will require that in fact the lock is not held.

Also, before this patch, run_dependent_requests() not only does what its
name suggests, but also removes the l2meta from the list of in-flight
requests. When changing this, it becomes an one-liner, so just inline it
completely.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 280d373579 qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2
This is closer to where the dirty flag is really needed, and it avoids
having checks for special cases related to cluster allocation directly
in the writev loop.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf f50f88b9fe qcow2: Allocate l2meta only for cluster allocations
Even for writes to already allocated clusters, an l2meta is allocated,
though it stays effectively unused. After this patch, only allocating
requests still have one. Each l2meta now describes an in-flight request
that writes to clusters that are not yet hooked up in the L2 table.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 060bee8943 qcow2: Drop l2meta.cluster_offset
There's no real reason to have an l2meta for normal requests that don't
allocate anything. Before we can get rid of it, we must return the host
cluster offset in a different way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf cf5c1a231e qcow2: Allocate l2meta dynamically
As soon as delayed COW is introduced, the l2meta struct is needed even
after completion of the request, so it can't live on the stack.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 67a7a0ebe5 qcow2: Move BLKDBG_EVENT out of the lock
We want to use these events to suspend requests for testing concurrent
AIO requests. Suspending requests while they are holding the CoMutex is
rather boring for this purpose.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12 12:33:48 +01:00
Jim Meyering 00ea188125 qcow2: mark this file's sole strncpy use as justified
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-05 07:58:38 -05:00
Jeff Cody 21d82ac95f block: qcow2 image file reopen
These are the stubs for the file reopen drivers for the qcow2 format.

There is currently nothing that needs to be done by the qcow2 driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24 15:15:12 +02:00
Stefan Hajnoczi 058f8f16db block: add BLOCK_O_CHECK for qemu-img check
Image formats with a dirty bit, like qed and qcow2, repair dirty image
files upon open with BDRV_O_RDWR.  Performing automatic repair when
qemu-img check runs is not ideal because the bdrv_open() call repairs
the image before the actual bdrv_check() call from qemu-img.c.

Fix this "double repair" since it leads to confusing output from
qemu-img check.  Tell the block driver that this image is being opened
just for bdrv_check().  This skips automatic repair and qemu-img.c can
invoke it manually with bdrv_check().

Update the golden output for qemu-iotests 039 to reflect the new
qemu-img check output.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10 10:25:12 +02:00
Stefan Hajnoczi acbe59829e qcow2: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qcow2_open().
Move this into qcow2_check() so that all callers benefit from this
behavior when fix mode is enabled.

This is necessary so qemu-img check can call .bdrv_check() and mark the
image clean.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10 10:25:12 +02:00
Stefan Hajnoczi bfe8043e92 qcow2: implement lazy refcounts
Lazy refcounts is a performance optimization for qcow2 that postpones
refcount metadata updates and instead marks the image dirty.  In the
case of crash or power failure the image will be left in a dirty state
and repaired next time it is opened.

Reducing metadata I/O is important for cache=writethrough and
cache=directsync because these modes guarantee that data is on disk
after each write (hence we cannot take advantage of caching updates in
RAM).  Refcount metadata is not needed for guest->file block address
translation and therefore does not need to be on-disk at the time of
write completion - this is the motivation behind the lazy refcount
optimization.

The lazy refcount optimization must be enabled at image creation time:

  qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G
  qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06 22:39:14 +02:00
Stefan Hajnoczi c61d0004bc qcow2: introduce dirty bit
This patch adds an incompatible feature bit to mark images that have not
been closed cleanly.  When a dirty image file is opened a consistency
check and repair is performed.

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06 22:39:14 +02:00
Anthony Liguori 23797df3d9 Merge remote-tracking branch 'mjt/mjt-iov2' into staging
* mjt/mjt-iov2:
  rewrite iov_send_recv() and move it to iov.c
  cleanup qemu_co_sendv(), qemu_co_recvv() and friends
  export iov_send_recv() and use it in iov_send() and iov_recv()
  rename qemu_sendv to iov_send, change proto and move declarations to iov.h
  change qemu_iovec_to_buf() to match other to,from_buf functions
  consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
  allow qemu_iovec_from_buffer() to specify offset from which to start copying
  consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
  rewrite iov_* functions
  change iov_* function prototypes to be more appropriate
  virtio-serial-bus: use correct lengths in control_out() message

Conflicts:
	tests/Makefile

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-09 12:35:06 -05:00
Stefan Hajnoczi b35278f754 qcow2: fix #ifdef'd qcow2_check_refcounts() callers
The DEBUG_ALLOC qcow2.h macro enables additional consistency checks
throughout the code.  This makes it easier to spot corruptions that are
introduced during development.  Since consistency check is an expensive
operation the DEBUG_ALLOC macro is used to compile checks out in normal
builds and qcow2_check_refcounts() calls missed the addition of a new
function argument.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09 15:53:01 +02:00
Stefan Hajnoczi af7b708db2 qcow2: fix autoclear image header update
The autoclear feature bits can be used for qcow2 file format features
that are safe to "drop" by old programs that do not understand the
feature.  Upon opening the image file unknown autoclear feature bits are
cleared and the image file header is rewritten, but this was happening
too early in the code when critical header fields were not yet loaded.

Process autoclear feature bits after all necessary header information
has been loaded.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:43 +02:00
Paolo Bonzini 6af4e9ead4 qcow2: always operate caches in writeback mode
Writethrough does not need special-casing anymore in the qcow2 caches.
The block layer adds flushes after every guest-initiated data write,
and these will also flush the qcow2 caches to the OS.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:43 +02:00
Kevin Wolf 166acf546f qcow2: Support for fixing refcount inconsistencies
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:42 +02:00
Kevin Wolf 4534ff5426 qemu-img check -r for repairing images
The QED block driver already provides the functionality to not only
detect inconsistencies in images, but also fix them. However, this
functionality cannot be manually invoked with qemu-img, but the
check happens only automatically during bdrv_open().

This adds a -r switch to qemu-img check that allows manual invocation
of an image repair.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:42 +02:00
Michael Tokarev d5e6b1619c change qemu_iovec_to_buf() to match other to,from_buf functions
It now allows specifying offset within qiov to start from and
amount of bytes to copy.  Actual implementation is just a call
to iov_to_buf().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 1b093c480a consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
qemu_iovec_concat() is currently a wrapper for
qemu_iovec_copy(), use the former (with extra
"0" arg) in a few places where it is used.

Change skip argument of qemu_iovec_copy() from
uint64_t to size_t, since size of qiov itself
is size_t, so there's no way to skip larger
sizes.  Rename it to soffset, to make it clear
that the offset is applied to src.

Also change the only usage of uint64_t in
hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() -
all callers of it actually uses size_t too,
not uint64_t.

One added restriction: as for all other iovec-related
functions, soffset must point inside src.

Order of argumens is already good:
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                   int c, size_t bytes)
vs:
 qemu_iovec_concat(QEMUIOVector *dst,
                   QEMUIOVector *src,
                   size_t soffset, size_t sbytes)
(note soffset is after _src_ not dst, since it applies to src;
for memset it applies to qiov).

Note that in many places where this function is used,
the previous call is qemu_iovec_reset(), which means
many callers actually want copy (replacing dst content),
not concat.  So we may want to add a wrapper like
qemu_iovec_copy() with the same arguments but which
calls qemu_iovec_reset() before _concat().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 03396148bc allow qemu_iovec_from_buffer() to specify offset from which to start copying
Similar to
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                   int c, size_t bytes);
the new prototype is:
 qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset,
                     const void *buf, size_t bytes);

The processing starts at offset bytes within qiov.

This way, we may copy a bounce buffer directly to
a middle of qiov.

This is exactly the same function as iov_from_buf() from
iov.c, so use the existing implementation and rename it
to qemu_iovec_from_buf() to be shorter and to match the
utility function.

As with utility implementation, we now assert that the
offset is inside actual iovec.  Nothing changed for
current callers, because `offset' parameter is new.

While at it, stop using "bounce-qiov" in block/qcow2.c
and copy decrypted data directly from cluster_data
instead of recreating a temp qiov for doing that.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 3d9b49254f consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
This patch combines two functions into one, and replaces
the implementation with already existing iov_memset() from
iov.c.

The new prototype of qemu_iovec_memset():
  size_t qemu_iovec_memset(qiov, size_t offset, int fillc, size_t bytes)
It is different from former qemu_iovec_memset_skip(), and
I want to make other functions to be consistent with it
too: first how much to skip, second what, and 3rd how many
of it.  It also returns actual number of bytes filled in,
which may be less than the requested `bytes' if qiov is
smaller than offset+bytes, in the same way iov_memset()
does.

While at it, use utility function iov_memset() from
iov.h in posix-aio-compat.c, where qiov was used.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:07:44 +04:00
Jim Meyering b6c147622d qcow2: don't leak buffer for unexpected qcow_version in header
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25 18:12:54 +02:00
Kevin Wolf c44bfe4637 qcow2: Don't ignore failure to clear autoclear flags
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-14 17:02:19 +02:00
Paolo Bonzini 5f3777945d block: push bdrv_change_backing_file error checking up from drivers
This check applies to all drivers, but QED lacks it.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10 10:32:11 +02:00
Zhi Yong Wu 15552c4ad3 qcow2: lock on prealloc
preallocate() will be locked. This is required because
qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it
can drop while COW is being performed.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07 19:33:18 +02:00
Stefan Weil b9531b6eed block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02 18:39:39 +02:00
Kevin Wolf 621f058940 qcow2: Zero write support
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:30 +02:00
Kevin Wolf cfcc4c62ff qcow2: Support for feature table header extension
Instead of printing an ugly bitmask, qemu can now print a more helpful
string even for yet unknown features.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 6377af48b0 qcow2: Support reading zero clusters
This adds support for reading zero clusters in version 3 images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 6744cbab8c qcow2: Version 3 images
This adds the basic infrastructure to qcow2 to handle version 3 images.
It includes code to create v3 images, allow header updates for v3 images
and checks feature bits.

It still misses support for zero clusters, so this is not a fully
compliant implementation of v3 yet.

The default for creating new images stays at v2 for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 68d000a390 qcow2: Ignore reserved bits in get_cluster_offset
With this change, reading from a qcow2 image ignores all reserved bits
that are set in an L1 or L2 table entry.

Now get_cluster_offset() assigns *cluster_offset only the offset without
any other flags. The cluster type is not longer encoded in the offset,
but a positive return value in case of success.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:27 +02:00
Paolo Bonzini 29cdb2513c block: push recursive flushing up from drivers
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05 14:54:39 +02:00
Kevin Wolf 259b217310 qcow2: Add error messages in qcow2_truncate
qemu-img resize has some limitations with qcow2, but the user is only
told that "this image format does not support resize". Quite confusing,
so add some more detailed error_report() calls and change "this image
format" into "this image".

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12 15:14:06 +01:00
Kevin Wolf 3cce16f44d qcow2: Add some tracing
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12 15:14:06 +01:00
Kevin Wolf 64ca6aee4f qcow2: Reject too large header extensions
Image files that make qemu-img info read several gigabytes into the
unknown header extensions list are bad. Just fail opening the image
if an extension claims to be larger than the header extension area.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf fd29b4bbef qcow2: Fix offset in qcow2_read_extensions
The spec says that the length of extensions is padded to 8 bytes, not
the offset. Currently this is the same because the header size is a
multiple of 8, so this is only about compatibility with future changes
to the header size.

While touching it, move the calculation to a common place instead of
duplicating it for each header extension type.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf 423477e556 qcow2: Fix build with DEBUG_EXT enabled
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf 75bab85ca0 qcow2: Keep unknown header extension when rewriting header
If we want header extensions to work as compatible extensions, we can't
destroy yet unknown header extensions when rewriting the header (e.g.
for changing the backing file). Save all unknown header extensions in a
list of blobs and include them in a new header.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09 16:17:51 +01:00
Kevin Wolf e24e49e619 qcow2: Update whole header at once
In order to switch the backing file, qcow2 issues multiple write
requests that only changed a part of the image header. Any failure after
the first one would leave the header in an corrupted state. With this
patch, the whole header is written at once, so we can't fail in the
middle.

At the same time, this gives us a reusable functions that updates all
fields of the qcow2 header and not only the backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09 16:17:51 +01:00
Li Zhi Hui 28c1202ba6 block/qcow2.c: call qcow2_free_snapshots in the function of qcow2_close
Signed-off-by: Li Zhi Hui <zhihuili@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15 12:40:08 +01:00
Anthony Liguori eb5d5beaeb Merge remote-tracking branch 'kwolf/for-anthony' into staging 2011-12-05 09:39:25 -06:00
Stefan Hajnoczi e8ee5e4c47 coroutine: add qemu_co_queue_restart_all()
It's common to wake up all waiting coroutines.  Introduce the
qemu_co_queue_restart_all() function to do this instead of looping over
qemu_co_queue_next() in every caller.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:38 +01:00
Stefan Hajnoczi f8a2e5e3ca block: convert qcow2, qcow2, and vmdk to .bdrv_co_is_allocated()
The qcow2, qcow, and vmdk block drivers are based on coroutines.  They have a
coroutine mutex which protects internal state.  We can convert the
.bdrv_is_allocated() function to .bdrv_co_is_allocated() by holding the mutex
around the cluster lookup operation.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:37 +01:00
Kevin Wolf 42deb29fed qcow2: Return real error code in qcow2_read_snapshots
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05 14:51:35 +01:00
Dong Xu Wang a968168c58 block: Add coroutine_fn marker to coroutine functions
Looks better when reviewing these source files.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:35 +01:00
Dong Xu Wang 9b2260cbd5 fix spelling in block sub directory
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-02 10:50:57 +00:00
Anthony Liguori 06d9260ffa qcow2: implement bdrv_invalidate_cache (v2)
We don't reopen the actual file, but instead invoke the close and open routines.
We specifically ignore the backing file since it's contents are read-only and
therefore immutable.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-21 14:58:48 -06:00
Kevin Wolf eb489bb1ec block: Introduce bdrv_co_flush_to_os
qcow2 has a writeback metadata cache, so flushing a qcow2 image actually
consists of writing back that cache to the protocol and only then flushes the
protocol in order to get everything stable on disk.

This introduces a separate bdrv_co_flush_to_os to reflect the split.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-11-11 14:02:59 +01:00
Kevin Wolf c68b89acd6 block: Rename bdrv_co_flush to bdrv_co_flush_to_disk
There are two different types of flush that you can do: Flushing one level up
to the OS (i.e. writing data to the host page cache) or flushing it all the way
down to the disk. The existing functions flush to the disk, reflect this in the
function name.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-11-11 14:02:59 +01:00
Dong Xu Wang c95de7e2c4 block: fix qcow2_co_flush deadlock
If qcow2_cache_flush failed, s->lock will not be unlock.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-28 19:25:49 +02:00
Paolo Bonzini 6db39ae2e2 block: change discard to co_discard
Since coroutine operation is now mandatory, convert both bdrv_discard
implementations to coroutines.  For qcow2, this means taking the lock
around the operation.  raw-posix remains synchronous.

The bdrv_discard callback is then unused and can be eliminated.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:14 +02:00
Paolo Bonzini 8b94ff8573 block: change flush to co_flush
Since coroutine operation is now mandatory, convert all bdrv_flush
implementations to coroutines.  For qcow2, this means taking the lock.
Other implementations are simpler and just forward bdrv_flush to the
underlying protocol, so they can avoid the lock.

The bdrv_flush callback is then unused and can be eliminated.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:14 +02:00
Kevin Wolf 8f1efd00c4 qcow2: Fix bdrv_write_compressed error handling
If during allocation of compressed clusters the cluster was already allocated
uncompressed, fail and properly release the l2_table (the latter avoids a
failed assertion).

While at it, make it return some real error numbers instead of -1.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
2011-10-21 17:34:13 +02:00
Stefan Hajnoczi 6f6dc6565e block: drop redundant bdrv_flush implementation
Block drivers now only need to provide either of .bdrv_co_flush,
.bdrv_aio_flush() or for legacy drivers .bdrv_flush().  Remove
the redundant .bdrv_flush() implementations.

[Paolo Bonzini: change raw driver to bdrv_co_flush]

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-10-21 17:34:13 +02:00
Frediano Ziglio dea43a65d6 qcow2: align cluster_data to block to improve performance using O_DIRECT
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-12 15:17:22 +02:00
Kevin Wolf 0fa9131a44 qcow2: Fix error cases to run depedent requests
Requests depending on a failed request would end up waiting forever. This fixes
the error path to continue dependent requests even when the request has failed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 11:23:51 +02:00
Kevin Wolf 8e217d5384 qcow2: Properly initialise QcowL2Meta
Dependency list pointers filled with random garbage from the stack aren't a
good idea.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-09-06 11:23:51 +02:00
Frediano Ziglio ab0997e0af qcow2: remove memory leak
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:15 +02:00
Frediano Ziglio 3fc48d0983 qcow2: Removed QCowAIOCB entirely
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio 5ebaa27e9a qcow2: reindent and use while before the big jump
prepare to remove read/write callbacks

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio e78c69b89c qcow2: remove common from QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio c2bdd9904b qcow2: remove cluster_offset from QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio c227140397 qcow2: remove l2meta from QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio faf575c136 qcow2: removed cur_nr_sectors field in QCowAIOCB
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio 4617310c33 qcow2: Removed unused AIOCB fields
Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Frediano Ziglio f5cd8173e7 qcow/qcow2: Allocate QCowAIOCB structure using stack
instead of calling qemi_aio_get use stack

Signed-off-by: Frediano Ziglio <freddy77@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 17:41:14 +02:00
Philipp Hahn 6cbc3031c8 qcow2: Fix DEBUG_* compilation
By introducing BlockDriverState compiling qcow2 with DEBUG_ALLOC and DEBUG_EXT
defined got broken.
Define a BdrvCheckResult structure locally which is now needed as the second
argument.

Also fix qcow2_read_extensions() needing BDRVQcowState.

Signed-off-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-23 14:15:17 +02:00
Anthony Liguori 7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Kevin Wolf 68d100e905 qcow2: Use coroutines
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-08-02 15:53:41 +02:00
Markus Armbruster 6daf194dde Strip trailing '\n' from error_report()'s first argument
error_report() prepends location, and appends a newline.  The message
constructed from the arguments should not contain a newline.  Fix the
obvious offenders.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-24 09:13:36 +01:00
Kevin Wolf 42496d6240 qcow2: Avoid direct AIO callback
bdrv_aio_* must not call the callback before returning to its caller. In qcow2,
this could happen in some error cases. This starts the real requests processing
in a BH to avoid this situation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-14 17:03:25 +02:00
Kevin Wolf 99cce9fa4e qemu-img create: Fix displayed default cluster size
When not specifying a cluster size on the command line, qemu-img printed
a cluster size of 0:

    Formatting '/tmp/test.qcow2', fmt=qcow2 size=67108864
    encryption=off cluster_size=0

This patch adds the default cluster size to the QEMUOptionParameter list, so
that it displays the default value that is used.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 11:56:40 +02:00
Christoph Hellwig a659979328 block: clarify the meaning of BDRV_O_NOCACHE
Change BDRV_O_NOCACHE to only imply bypassing the host OS file cache,
but no writeback semantics.  All existing callers are changed to also
specify BDRV_O_CACHE_WB to give them writeback semantics.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-06-08 10:39:32 +02:00
Kevin Wolf e8cdcec123 qcow2: Report error for version > 2
The qcow2 driver is now declared responsible for any QCOW image that has
version 2 or greater (before this, version 3 would be detected as raw).

For everything newer than version 2, an error is reported.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2011-02-10 13:23:56 +01:00
Kevin Wolf 8af3648843 qcow2: Fix error handling for reading compressed clusters
When reading a compressed cluster failed, qcow2 falsely returned success.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2011-02-10 13:23:44 +01:00
Kevin Wolf 3ab4c7e92d qcow2: Fix error handling for immediate backing file read failure
Requests could return success even though they failed when bdrv_aio_readv
returned NULL for a backing file read.

Reported-by: Chunqiang Tang <ctang@us.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10 13:23:44 +01:00
Chunqiang Tang e0d9c6f937 QCOW2: bug fix - read base image beyond its size
This patch fixes the following bug in QCOW2. For a QCOW2 image that is larger
than its base image, when handling a read request straddling over the end of the
base image, the QCOW2 driver attempts to read beyond the end of the base image
and the request would fail.

This bug was found by Fast Virtual Disk (FVD)'s fully automated testing tool.
The following test triggered the bug.

dd if=/dev/zero of=/var/ramdisk/truth.raw count=0 bs=1 seek=1098561536
dd if=/dev/zero of=/var/ramdisk/zero-500M.raw count=0 bs=1 seek=593099264
./qemu-img create -f qcow2 -ocluster_size=65536,backing_fmt=blksim -b /var/ramdisk/zero-500M.raw /var/ramdisk/test.qcow2 1098561536
./qemu-io --auto --seed=30477694 --truth=/var/ramdisk/truth.raw --format=qcow2 --test=blksim:/var/ramdisk/test.qcow2 --verify_write=true --compare_before=false --compare_after=true --round=100000 --parallel=100 --io_size=10485760 --fail_prob=0 --cancel_prob=0 --instant_qemubh=true

Signed-off-by: Chunqiang Tang <ctang@us.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-10 13:23:44 +01:00
Kevin Wolf e1a7107f2d qcow2: Really use cache=unsafe for image creation
For cache=unsafe we also need to set BDRV_O_CACHE_WB, otherwise we have some
strange unsafe writethrough mode.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-02-07 09:44:22 +01:00
Kevin Wolf 5ea929e3d1 qcow2: Add bdrv_discard support
This adds a bdrv_discard function to qcow2 that frees the discarded clusters.
It does not yet pass the discard on to the underlying file system driver, but
the space can be reused by future writes to the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-01-31 10:03:00 +01:00
Kevin Wolf 29c1a7301a qcow2: Use QcowCache
Use the new functions of qcow2-cache.c for everything that works on refcount
block and L2 tables.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-01-24 16:41:49 +01:00
Jes Sorensen 6d85a57e20 Add proper -errno error return values to qcow2_open()
In addition this adds missing braces to the function to be consistent
with the coding style.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17 16:15:04 +01:00
Jes Sorensen 7c80ab3f21 block/qcow2.c: rename qcow_ functions to qcow2_
It doesn't really make sense for functions in qcow2.c to be named
qcow_ so convert the names to match correctly.

Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-12-17 16:15:01 +01:00
Kevin Wolf 205ef7961f block: Allow bdrv_flush to return errors
This changes bdrv_flush to return 0 on success and -errno in case of failure.
It's a requirement for implementing proper error handle in users of bdrv_flush.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2010-11-04 12:52:16 +01:00
edison 51ef67270b Copy snapshots out of QCOW2 disk
In order to backup snapshots, created from QCOW2 iamge, we want to copy snapshots out of QCOW2 disk to a seperate storage.
The following patch adds a new option in "qemu-img": qemu-img convert -f qcow2 -O qcow2 -s snapshot_name src_img bck_img.
Right now, it only supports to copy the full snapshot, delta snapshot is on the way.

Changes from V1: all the comments from Kevin are addressed:
Add read-only checking
Fix coding style
Change the name from bdrv_snapshot_load to bdrv_snapshot_load_tmp

Signed-off-by: Disheng Su <edison@cloud.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22 14:49:35 +02:00
Kevin Wolf 9b036055ef qcow2: Remove old image creation function
They have been #ifdef'd out by the previous patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22 14:49:35 +02:00
Kevin Wolf a9420734b6 qcow2: Simplify image creation
Instead of doing lots of magic for setting up initial refcount blocks and stuff
create a minimal (inconsistent) image, open it and initialize the rest with
regular qcow2 functions.

This is a complete rewrite of the image creation function. The old
implementating is #ifdef'd out and will be removed by the next patch (removing
it here would have made the diff unreadable because diff tries to find
similarities when it's really a rewrite)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22 14:49:35 +02:00
Stefan Hajnoczi 72893756e0 qcow2: Support exact L1 table growth
The L1 table grow operation includes a size calculation that bumps up
the new L1 table size in order to anticipate the size needs of vmstate
data.  This helps reduce the number of times that the L1 table has to be
grown when vmstate data is appended.

This size overhead is not necessary during image creation,
bdrv_truncate(), or snapshot goto operations.  In fact, existing
qemu-iotests that exercise table growth are no longer able to trigger it
because image creation preallocates an L1 table that is too large after
changes to qcow_create2().

This patch keeps the size calculation but also adds exact growth for
callers that do not want to inflate the L1 table size unnecessarily.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-10-22 14:49:35 +02:00
Kevin Wolf 6f5f060b73 qcow2: Avoid bounce buffers for AIO write requests
qcow2 used to use bounce buffers for any AIO requests. This does not only imply
unnecessary copying, but also unbounded allocations which should be avoided.

This patch removes bounce buffers from the normal AIO write path. Encrypted
images continue to use a bounce buffer, however with constant size.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21 15:39:43 +02:00
Kevin Wolf bd28f83565 qcow2: Avoid bounce buffers for AIO read requests
qcow2 used to use bounce buffers for any AIO requests. This does not only imply
unnecessary copying, but also unbounded allocations which should be avoided.

This patch removes bounce buffers from the normal AIO read path, and constrains
them to a constant size for encrypted images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-09-21 15:39:42 +02:00
Kevin Wolf 9ac228e02c qcow2/vdi: Change check to distinguish error cases
This distinguishes between harmless leaks and real corruption. Hopefully users
better understand what qemu-img check wants to tell them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-06 17:05:49 +02:00
Kevin Wolf 19dbcbf7cc qcow2: Fix error handling during metadata preallocation
People were wondering why qemu-img check failed after they tried to preallocate
a large qcow2 file and ran out of disk space.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-07-02 13:18:01 +02:00
Kevin Wolf 8b3b720620 qcow2: Use bdrv_(p)write_sync for metadata writes
Use bdrv_(p)write_sync to ensure metadata integrity in case of a crash.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-06-22 14:38:02 +02:00
Kevin Wolf 1c46efaa0a qcow2: Allow qcow2_get_cluster_offset to return errors
qcow2_get_cluster_offset() looks up a given virtual disk offset and returns the
offset of the corresponding cluster in the image file. Errors (e.g. L2 table
can't be read) are currenctly indicated by a return value of 0, which is
unfortuately the same as for any unallocated cluster. So in effect we can't
check for errors.

This makes the old return value a by-reference parameter and returns the usual
0/-errno error code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-28 13:29:11 +02:00
Blue Swirl 0bfcd599e3 Fix %lld or %llx printf format use
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-05-22 08:02:12 +00:00
Kevin Wolf b666d23950 block: Avoid unchecked casts for AIOCBs
Use container_of for one direction and &acb->common for the other one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-17 10:20:05 +02:00
Kevin Wolf 92b30744d7 qcow2: Remove static forward declaration
OpenBSDs gcc is said to generate warnings for this declaration, so don't
reference bdrv_qcow2 directly, but look it up using bdrv_find_format.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-05-07 17:11:37 +00:00
Kevin Wolf de5f3f40af Revert "Fix OpenBSD build"
This reverts commit 20d97356c9.
The BlockDriver definition should stay at the end of source files.

Conflicts:

	block/qcow2.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-05-07 17:11:02 +00:00
Stefan Hajnoczi 419b19d9b4 qcow2: Implement bdrv_truncate() for growing images
This patch adds the ability to grow qcow2 images in-place using
bdrv_truncate().  This enables qemu-img resize command support for
qcow2.

Snapshots are not supported and bdrv_truncate() will return -ENOTSUP.
The notion of resizing an image with snapshots could lead to confusion:
users may expect snapshots to remain unchanged, but this is not possible
with the current qcow2 on-disk format where the header.size field is
global instead of per-snapshot.  Others may expect snapshots to change
size along with the current image data.  I think it is safest to not
support snapshots and perhaps add behavior later if there is a
consensus.

Backing images continue to work.  If the image is now larger than its
backing image, zeroes are read when accessing beyond the end of the
backing image.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03 10:07:32 +02:00
Kevin Wolf 66f82ceed6 block: Open the underlying image file in generic code
Format drivers shouldn't need to bother with things like file names, but rather
just get an open BlockDriverState for the underlying protocol. This patch
introduces this behaviour for bdrv_open implementation. For protocols which
need to access the filename to open their file/device/connection/... a new
callback bdrv_file_open is introduced which doesn't get an underlying file
opened.

For now, also some of the more obscure formats use bdrv_file_open because they
open() the file themselves instead of using the block.c functions. They need to
be fixed in later patches.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-05-03 10:07:30 +02:00
Blue Swirl 20d97356c9 Fix OpenBSD build
GCC 3.3.5 generates warnings for static forward declarations of data, so
rearrange code to use static forward declarations of functions instead.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2010-04-23 20:19:47 +00:00
Stefan Hajnoczi d4c146f0da qcow2: Use QLIST_FOREACH_SAFE macro
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23 16:21:58 +02:00
Kevin Wolf d6e9098e10 Replace calls of old bdrv_open
What is known today as bdrv_open2 becomes the new bdrv_open. All remaining
callers of the old function are converted to the new one. In some places they
even know the right format, so they should have used bdrv_open2 from the
beginning.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23 16:08:46 +02:00
Kevin Wolf 4768fa902c qcow2: Fix creation of large images
qcow_create2 assumes that the new image will only need one cluster for its
refcount table initially. Obviously that's not true any more when the image is
big enough (exact value depends on the cluster size).

This patch calculates the refcount table size dynamically.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23 16:08:46 +02:00
Kevin Wolf 8252278afb qcow2: Trigger blkdebug events
This adds blkdebug events to qcow2 to allow injecting I/O errors in specific
places.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2010-04-23 16:08:46 +02:00
Kevin Wolf c644db3d53 qcow2: Remove request from in-flight list after error
If we complete a request with a failure we need to remove it from the list of
requests that are in flight. If we don't do it, the next time the same AIOCB is
used for a cluster allocation it will create a loop in the list and qemu will
hang in an endless loop.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-10 01:25:30 +02:00
Kevin Wolf 171e3d6b99 qcow2: Don't ignore immediate read/write failures
Returning -EIO is far from optimal, but at least it's an error code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2010-04-10 01:23:08 +02:00
Juan Quintela bef57da55c qcow2: return errno instead of -1
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-03-09 11:23:00 -06:00
Kevin Wolf 6f745bdaac qcow2: Fix image creation regression
When checking for errors, commit db89119d compares with the wrong values,
failing image creation even when there was no error. Additionally, if an
error has occured, we can't preallocate the image (it's likely broken).

This unbreaks test 023 of qemu-iotests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-02-23 13:23:29 -06:00
Christoph Hellwig 7b88e48ba5 qcow2: rename two QCowAIOCB members
The n member is not very descriptive and very hard to grep, rename it to
cur_nr_sectors to better indicate what it is used for.  Also rename
nb_sectors to remaining_sectors as that is what it is used for.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 15:45:00 -06:00
Naphtali Sprei 058fc8c768 Ask for read-write permissions when opening files
Found some places that seems needs this explicitly, now that
read-write is not the default.

Signed-off-by: Naphtali Sprei <nsprei@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 15:42:01 -06:00
Kirill A. Shutemov db89119d40 block/qcow2.c: fix warnings with _FORTIFY_SOURCE
CC    block/qcow2.o
cc1: warnings being treated as errors
block/qcow2.c: In function 'qcow_create2':
block/qcow2.c:829: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:838: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:839: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:841: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:844: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:849: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:852: error: ignoring return value of 'write', declared with attribute warn_unused_result
block/qcow2.c:855: error: ignoring return value of 'write', declared with attribute warn_unused_result
make: *** [block/qcow2.o] Error 1

Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 14:59:20 -06:00
Kevin Wolf 148da7ea9d qcow2: Return 0/-errno in qcow2_alloc_cluster_offset
Returning 0/-errno allows it to distingush different errors classes. The
cluster offset of newly allocated clusters is now returned in the QCowL2Meta
struct.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 14:59:19 -06:00
Kevin Wolf 1d36e3aae3 qcow2: Fix error handling in qcow_save_vmstate
Don't assume success but pass the bdrv_pwrite return value on.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-26 14:59:19 -06:00
Kevin Wolf f8012c135e qcow/qcow2: implement bdrv_aio_flush
Now that we do not have to flush the backing device anymore implementing
the bdrv_aio_flush method for image formats is trivial.

[hch: forward ported to qemu mainline from a product tree]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13 17:14:15 -06:00
Kevin Wolf 756e6736a1 block: Add bdrv_change_backing_file
Introduce the functions needed to change the backing file of an image. The
function is implemented for qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2010-01-13 17:14:15 -06:00
Kevin Wolf e1c7f0e3f9 qcow2: Store exact backing format length
Currently qcow2 unnecessarily rounds up the length of the backing format string
to the next multiple of 8. At the same time, the array in BlockDriverState can
only hold 15 characters, so in effect backing formats with 9 characters or more
don't work (e.g. host_device).

Save the real string length and things start to work for all valid image format
names.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-12-03 11:45:49 -06:00
Stefan Weil d191d12d5f qcow2: Allow qcow2 disk images with size zero
Images with disk size 0 may be used for
VM snapshots, but not to save normal block data.

It is possible to create such images using
qemu-img, but opening them later fails.

So even "qemu-img info image.qcow2" is not
possible for an image created with
"qemu-img create -f qcow2 image.qcow2 0".

This is fixed here.

Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-11-09 08:43:01 -06:00
Kevin Wolf 72ecf02d7d Revert "qcow2: Bring synchronous read/write back to life"
It was merely a workaround and the real fix is done now.
This reverts commit ef845c3bf4.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-27 12:28:59 -05:00
Kevin Wolf ef845c3bf4 qcow2: Bring synchronous read/write back to life
When the synchronous read and write functions were dropped, they were replaced
by generic emulation functions. Unfortunately, these emulation functions don't
provide the same semantics as the original functions did.

The original bdrv_read would mean that we read some data synchronously and that
we won't be interrupted during this read. The latter assumption is no longer
true with the emulation function which needs to use qemu_aio_poll and therefore
allows the callback of any other concurrent AIO request to be run during the
read. Which in turn means that (meta)data read earlier could have changed and
be invalid now. qcow2 is not prepared to work in this way and it's just scary
how many places there are where other requests could run.

I'm not sure yet where exactly it breaks, but you'll see breakage with virtio
on qcow2 with a backing file. Providing synchronous functions again fixes the
problem for me.

Patchworks-ID: 35437
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-10-15 09:32:04 -05:00
Blue Swirl 72cf2d4f0e Fix sys-queue.h conflict for good
Problem: Our file sys-queue.h is a copy of the BSD file, but there are
some additions and it's not entirely compatible. Because of that, there have
been conflicts with system headers on BSD systems. Some hacks have been
introduced in the commits 15cc923584,
f40d753718,
96555a96d7 and
3990d09adf but the fixes were fragile.

Solution: Avoid the conflict entirely by renaming the functions and the
file. Revert the previous hacks.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-09-12 07:36:22 +00:00
Kevin Wolf f214978a42 qcow2: Order concurrent AIO requests on the same unallocated cluster
When two AIO requests write to the same cluster, and this cluster is
unallocated, currently both requests allocate a new cluster and the second one
merges the first one when it is completed. This means an cluster allocation, a
read and a cluster deallocation which cause some overhead. If we simply let the
second request wait until the first one is done, we improve overall performance
with AIO requests (specifially, qcow2/virtio combinations).

This patch maintains a list of in-flight requests that have allocated new
clusters. A second request touching the same cluster is limited so that it
either doesn't touch the allocation of the first request (so it can have a
non-overlapping allocation) or it waits for the first request to complete.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-09 17:31:26 -05:00
Kevin Wolf ea80b906f4 qcow2: Fix metadata preallocation
The wrong version of the preallocation patch has been applied, so this is the
remaining diff.

We can't use truncate to grow the image file to the right size because we don't
know if metadata has been written after the last data cluster. In this case
truncate would shrink the file and destroy its metadata. Write a zero sector at
the end of the virtual disk instead to ensure that the file is big enough.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-09-09 17:31:26 -05:00
Blue Swirl 2000cbc50d Fix gcc 3 warning about uninitialized variable
If nb_sectors is 0, cluster_offset will not be initialized.

Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2009-08-29 16:37:26 +03:00
Kevin Wolf a35e1c177d qcow2: Metadata preallocation
This introduces a qemu-img create option for qcow2 which allows the metadata to
be preallocated, i.e. clusters are reserved in the refcount table and L1/L2
tables, but no data is written to them. Metadata is quite small, so this
happens in almost no time.

Especially with qcow2 on virtio this helps to gain a bit of performance during
the initial writes. However, as soon as create a snapshot, we're back to the
normal slow speed, obviously. So this isn't the real fix, but kind of a cheat
while we're still having trouble with qcow2 on virtio.

Note that the option is disabled by default and needs to be specified
explicitly using qemu-img create -f qcow2 -o preallocation=metadata.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-08-27 20:30:20 -05:00
Christoph Hellwig 45566e9c99 replace bdrv_{get, put}_buffer with bdrv_{load, save}_vmstate
The VM state offset is a concept internal to the image format.  Replace
the old bdrv_{get,put}_buffer method that require an index into the
image file that is constructed from the VM state offset and an offset
into the vmstate with the bdrv_{load,save}_vmstate that just take an
offset into the VM state.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-16 08:28:13 -05:00
Kevin Wolf 3f6a3ee51e qcow2: Fix L1 table memory allocation
Contrary to what one could expect, the size of L1 tables is not cluster
aligned. So as we're writing whole sectors now instead of single entries,
we need to ensure that the L1 table in memory is large enough; otherwise
write would access memory after the end of the L1 table.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-10 13:44:29 -05:00
Kevin Wolf 0aa217e461 qcow2: Make cache=writethrough default
The performance of qcow2 has improved meanwhile, so we don't need to
special-case it any more. Switch the default to write-through caching
like all other block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-07-09 16:06:37 -05:00
Filip Navara 14899cdf3a Fix QCOW2 debugging code to compile again
Updated to use C99 comments.

Signed-off-by: Filip Navara <filip.navara@gmail.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-29 08:52:40 -05:00
Kevin Wolf ed6ccf0f51 qcow2: Rename global functions
The qcow2 source is now split into several more manageable files. During the
conversion quite some functions that were static before needed to be changed to
be global to make the source compile again.

We were lucky enough not to get name conflicts with these additional global
names, but they are not nice. This patch adds a qcow2_ prefix to all of the
global functions in qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:36 -05:00
Kevin Wolf c142442b06 qcow2: Split out snapshot functions
qcow2-snapshot.c contains the code related to snapshotting.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:36 -05:00
Kevin Wolf 45aba42fba qcow2: Split out guest cluster functions
qcow2-cluster.c contains all functions related to the management of guest
clusters, i.e. what the guest sees on its virtual disk. This code is about
mapping these guest clusters to host clusters in the image file using the
two-level lookup tables.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:36 -05:00
Kevin Wolf f7d0fe0239 qcow2: Split out refcount handling
qcow2-refcount.c contains all functions which are related to cluster
allocation and management in the image file. A large part of this is the
reference counting of these clusters.

Also a header file qcow2.h is introduced which will contain the interface of
the split qcow2 modules.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:36 -05:00
Kevin Wolf 9ccb258e28 qcow2: Change default cluster size to 64k
Larger cluster sizes mean less metadata. This has been discussion a few times,
let's do it now. This turns 64k clusters on by default for new images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-06-16 15:18:36 -05:00
Kevin Wolf db08adf526 qemu-img: Print available options with -o ?
This patch adds a small help text to each of the options in the block drivers
which can be displayed by using qemu-img create -f fmt -o ?

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2009-06-06 18:38:57 +03:00
Christoph Hellwig c16b5a2ca0 fully split aio_pool from BlockDriver
Now that we have a separate aio pool structure we can remove those
aio pool details from BlockDriver.

Every driver supporting AIO now needs to declare a static AIOPool
with the aiocb size and the cancellation method.  This cleans up the
current code considerably and will make it cleaner and more obvious
to support two different aio implementations behind a single
BlockDriver.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-27 09:46:03 -05:00
Kevin Wolf a980c98cf1 qcow2: Update multiple refcounts at once
Don't write each single changed refcount block entry to the disk after it is
written, but update all entries of the block and write all of them at once.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-27 09:45:20 -05:00
Kevin Wolf 44ff42de1c qcow2: Refactor update_refcount
This is a preparation patch with no functional changes. It moves the allocation
of new refcounts block to a new function and makes update_cluster_refcount (for
one cluster) call update_refcount (for multiple clusters) instead the other way
round.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-27 09:45:15 -05:00
Kevin Wolf ade406775d qcow/qcow2: Drop synchronous qcow_write()
There is only one (internal) user left and it can be switched to the normal
emulation provided in block.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-27 09:45:10 -05:00
Kevin Wolf 73c632edc4 qcow2: Allow different cluster sizes
Add an option to specify the cluster size of a newly created qcow2 image.
Default is 4k which is the same value that was hard-coded before.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:32 -05:00
Kevin Wolf 0e7e1989f7 Convert all block drivers to new bdrv_create
Now we can make use of the newly introduced option structures. Instead of
having bdrv_create carry more and more parameters (which are format specific in
most cases), just pass a option structure as defined by the driver itself.

bdrv_create2() contains an emulation of the old interface to simplify the
transition.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-22 10:50:31 -05:00
malc eb0b64f7aa Do not attempt to allocate sn_tab when there are no snapshots
This was caught by a7d27b536f which
aborted on this attempt, thanks to Alex Ivanov for report.

Signed-off-by: malc <av1474@comtv.ru>
2009-05-21 05:40:53 +04:00
Anthony Liguori 019d6b8ff0 Move block drivers into their own directory
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2009-05-14 16:13:46 -05:00