145 Commits

Author SHA1 Message Date
Paolo Bonzini 737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Paolo Bonzini 7b1b5d1913 qapi: move include files to include/qobject/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Kevin Wolf 4e95314e2b qcow2: Execute run_dependent_requests() without lock
There's no reason for run_dependent_requests() to hold s->lock, and a
later patch will require that in fact the lock is not held.

Also, before this patch, run_dependent_requests() not only does what its
name suggests, but also removes the l2meta from the list of in-flight
requests. When changing this, it becomes a one-liner, so just inline it
completely.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 280d373579 qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2
This is closer to where the dirty flag is really needed, and it avoids
having checks for special cases related to cluster allocation directly
in the writev loop.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf f50f88b9fe qcow2: Allocate l2meta only for cluster allocations
Even for writes to already allocated clusters, an l2meta is allocated,
though it stays effectively unused. After this patch, only allocating
requests still have one. Each l2meta now describes an in-flight request
that writes to clusters that are not yet hooked up in the L2 table.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 060bee8943 qcow2: Drop l2meta.cluster_offset
There's no real reason to have an l2meta for normal requests that don't
allocate anything. Before we can get rid of it, we must return the host
cluster offset in a different way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf cf5c1a231e qcow2: Allocate l2meta dynamically
As soon as delayed COW is introduced, the l2meta struct is needed even
after completion of the request, so it can't live on the stack.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-13 15:37:59 +01:00
Kevin Wolf 67a7a0ebe5 qcow2: Move BLKDBG_EVENT out of the lock
We want to use these events to suspend requests for testing concurrent
AIO requests. Suspending requests while they are holding the CoMutex is
rather boring for this purpose.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-12-12 12:33:48 +01:00
Jim Meyering 00ea188125 qcow2: mark this file's sole strncpy use as justified
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-05 07:58:38 -05:00
Jeff Cody 21d82ac95f block: qcow2 image file reopen
These are the stubs for the file reopen drivers for the qcow2 format.

There is currently nothing that needs to be done by the qcow2 driver
in reopen.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-24 15:15:12 +02:00
Stefan Hajnoczi 058f8f16db block: add BDRV_O_CHECK for qemu-img check
Image formats with a dirty bit, like qed and qcow2, repair dirty image
files upon open with BDRV_O_RDWR.  Performing automatic repair when
qemu-img check runs is not ideal because the bdrv_open() call repairs
the image before the actual bdrv_check() call from qemu-img.c.

Fix this "double repair" since it leads to confusing output from
qemu-img check.  Tell the block driver that this image is being opened
just for bdrv_check().  This skips automatic repair and qemu-img.c can
invoke it manually with bdrv_check().

Update the golden output for qemu-iotests 039 to reflect the new
qemu-img check output.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10 10:25:12 +02:00
Stefan Hajnoczi acbe59829e qcow2: mark image clean after repair succeeds
The dirty bit is cleared after image repair succeeds in qcow2_open().
Move this into qcow2_check() so that all callers benefit from this
behavior when fix mode is enabled.

This is necessary so qemu-img check can call .bdrv_check() and mark the
image clean.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-10 10:25:12 +02:00
Stefan Hajnoczi bfe8043e92 qcow2: implement lazy refcounts
Lazy refcounts is a performance optimization for qcow2 that postpones
refcount metadata updates and instead marks the image dirty.  In the
case of crash or power failure the image will be left in a dirty state
and repaired next time it is opened.

Reducing metadata I/O is important for cache=writethrough and
cache=directsync because these modes guarantee that data is on disk
after each write (hence we cannot take advantage of caching updates in
RAM).  Refcount metadata is not needed for guest->file block address
translation and therefore does not need to be on-disk at the time of
write completion - this is the motivation behind the lazy refcount
optimization.

The lazy refcount optimization must be enabled at image creation time:

  qemu-img create -f qcow2 -o compat=1.1,lazy_refcounts=on a.qcow2 10G
  qemu-system-x86_64 -drive if=virtio,file=a.qcow2,cache=writethrough

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06 22:39:14 +02:00
Stefan Hajnoczi c61d0004bc qcow2: introduce dirty bit
This patch adds an incompatible feature bit to mark images that have not
been closed cleanly.  When a dirty image file is opened a consistency
check and repair is performed.

Update qemu-iotests 031 and 036 since the extension header size changes
when we add feature bit table entries.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-08-06 22:39:14 +02:00
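The open-time decision described in c61d0004bc can be sketched as a small standalone C function. This is only an illustration, not the code from block/qcow2.c: the names SKETCH_INCOMPAT_DIRTY, sketch_open_action and the known_mask parameter are hypothetical, and the sketch assumes the dirty flag is bit 0 of the incompatible feature bitmap.

```c
#include <stdint.h>

/* Assumption of this sketch: the dirty flag is bit 0 of the
 * incompatible feature bitmap. The macro name is hypothetical. */
#define SKETCH_INCOMPAT_DIRTY (1ULL << 0)

enum sketch_open_action {
    SKETCH_OPEN_OK,      /* image was closed cleanly */
    SKETCH_OPEN_REPAIR,  /* dirty: run a consistency check and repair */
    SKETCH_OPEN_REFUSE   /* unknown incompatible feature: do not open */
};

/* Decide what to do with an image given its incompatible feature bits
 * and the set of incompatible bits this program understands. */
static enum sketch_open_action
sketch_check_incompat(uint64_t incompatible_features, uint64_t known_mask)
{
    if (incompatible_features & ~known_mask) {
        return SKETCH_OPEN_REFUSE;
    }
    if (incompatible_features & SKETCH_INCOMPAT_DIRTY) {
        return SKETCH_OPEN_REPAIR;
    }
    return SKETCH_OPEN_OK;
}
```

Because the bit is an incompatible feature, a program that does not know about it falls into the refuse branch instead of reading possibly stale refcounts.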
Anthony Liguori 23797df3d9 Merge remote-tracking branch 'mjt/mjt-iov2' into staging
* mjt/mjt-iov2:
  rewrite iov_send_recv() and move it to iov.c
  cleanup qemu_co_sendv(), qemu_co_recvv() and friends
  export iov_send_recv() and use it in iov_send() and iov_recv()
  rename qemu_sendv to iov_send, change proto and move declarations to iov.h
  change qemu_iovec_to_buf() to match other to,from_buf functions
  consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
  allow qemu_iovec_from_buffer() to specify offset from which to start copying
  consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
  rewrite iov_* functions
  change iov_* function prototypes to be more appropriate
  virtio-serial-bus: use correct lengths in control_out() message

Conflicts:
	tests/Makefile

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-07-09 12:35:06 -05:00
Stefan Hajnoczi b35278f754 qcow2: fix #ifdef'd qcow2_check_refcounts() callers
The DEBUG_ALLOC qcow2.h macro enables additional consistency checks
throughout the code.  This makes it easier to spot corruptions that are
introduced during development.  Since a consistency check is an expensive
operation, the DEBUG_ALLOC macro is used to compile the checks out in normal
builds.  As a result, the #ifdef'd qcow2_check_refcounts() calls missed the
addition of a new function argument.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-07-09 15:53:01 +02:00
Stefan Hajnoczi af7b708db2 qcow2: fix autoclear image header update
The autoclear feature bits can be used for qcow2 file format features
that are safe to "drop" by old programs that do not understand the
feature.  Upon opening the image file unknown autoclear feature bits are
cleared and the image file header is rewritten, but this was happening
too early in the code when critical header fields were not yet loaded.

Process autoclear feature bits after all necessary header information
has been loaded.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:43 +02:00
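The autoclear handling described in af7b708db2 amounts to masking out the bits the program does not understand and rewriting the header only if something changed. A minimal sketch in C, assuming a hypothetical helper name and a caller-supplied known_mask (neither is taken from block/qcow2.c):

```c
#include <stdbool.h>
#include <stdint.h>

/* Clear every autoclear bit that is not in known_mask.
 * Returns true if any bit was cleared, i.e. if the image header
 * would have to be rewritten. Hypothetical helper for illustration. */
static bool sketch_process_autoclear(uint64_t *autoclear_features,
                                     uint64_t known_mask)
{
    uint64_t kept = *autoclear_features & known_mask;
    bool changed = (kept != *autoclear_features);

    *autoclear_features = kept;
    return changed;
}
```

The point of the fix is ordering: a helper like this should run only after all other header fields have been loaded, since a true return value triggers a header rewrite that depends on them.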
Paolo Bonzini 6af4e9ead4 qcow2: always operate caches in writeback mode
Writethrough does not need special-casing anymore in the qcow2 caches.
The block layer adds flushes after every guest-initiated data write,
and these will also flush the qcow2 caches to the OS.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:43 +02:00
Kevin Wolf 166acf546f qcow2: Support for fixing refcount inconsistencies
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:42 +02:00
Kevin Wolf 4534ff5426 qemu-img check -r for repairing images
The QED block driver already provides the functionality to not only
detect inconsistencies in images, but also fix them. However, this
functionality cannot be manually invoked with qemu-img, but the
check happens only automatically during bdrv_open().

This adds a -r switch to qemu-img check that allows manual invocation
of an image repair.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-06-15 14:03:42 +02:00
Michael Tokarev d5e6b1619c change qemu_iovec_to_buf() to match other to,from_buf functions
It now allows specifying offset within qiov to start from and
amount of bytes to copy.  Actual implementation is just a call
to iov_to_buf().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 1b093c480a consolidate qemu_iovec_copy() and qemu_iovec_concat() and make them consistent
qemu_iovec_concat() is currently a wrapper for
qemu_iovec_copy(), use the former (with extra
"0" arg) in a few places where it is used.

Change skip argument of qemu_iovec_copy() from
uint64_t to size_t, since size of qiov itself
is size_t, so there's no way to skip larger
sizes.  Rename it to soffset, to make it clear
that the offset is applied to src.

Also change the only usage of uint64_t in
hw/9pfs/virtio-9p.c, in v9fs_init_qiov_from_pdu() -
all callers of it actually use size_t too,
not uint64_t.

One added restriction: as for all other iovec-related
functions, soffset must point inside src.

Order of arguments is already good:
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                   int c, size_t bytes)
vs:
 qemu_iovec_concat(QEMUIOVector *dst,
                   QEMUIOVector *src,
                   size_t soffset, size_t sbytes)
(note soffset is after _src_ not dst, since it applies to src;
for memset it applies to qiov).

Note that in many places where this function is used,
the previous call is qemu_iovec_reset(), which means
many callers actually want copy (replacing dst content),
not concat.  So we may want to add a wrapper like
qemu_iovec_copy() with the same arguments but which
calls qemu_iovec_reset() before _concat().

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 03396148bc allow qemu_iovec_from_buffer() to specify offset from which to start copying
Similar to
 qemu_iovec_memset(QEMUIOVector *qiov, size_t offset,
                   int c, size_t bytes);
the new prototype is:
 qemu_iovec_from_buf(QEMUIOVector *qiov, size_t offset,
                     const void *buf, size_t bytes);

The processing starts at offset bytes within qiov.

This way, we may copy a bounce buffer directly into
the middle of a qiov.

This is exactly the same function as iov_from_buf() from
iov.c, so use the existing implementation and rename it
to qemu_iovec_from_buf() to be shorter and to match the
utility function.

As with utility implementation, we now assert that the
offset is inside actual iovec.  Nothing changed for
current callers, because `offset' parameter is new.

While at it, stop using "bounce-qiov" in block/qcow2.c
and copy decrypted data directly from cluster_data
instead of recreating a temp qiov for doing that.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:12:11 +04:00
Michael Tokarev 3d9b49254f consolidate qemu_iovec_memset{,_skip}() into single function and use existing iov_memset()
This patch combines two functions into one, and replaces
the implementation with already existing iov_memset() from
iov.c.

The new prototype of qemu_iovec_memset():
  size_t qemu_iovec_memset(qiov, size_t offset, int fillc, size_t bytes)
It is different from former qemu_iovec_memset_skip(), and
I want to make other functions to be consistent with it
too: first how much to skip, second what, and 3rd how many
of it.  It also returns actual number of bytes filled in,
which may be less than the requested `bytes' if qiov is
smaller than offset+bytes, in the same way iov_memset()
does.

While at it, use utility function iov_memset() from
iov.h in posix-aio-compat.c, where qiov was used.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2012-06-11 23:07:44 +04:00
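The argument order and return value described in 3d9b49254f (offset first, then fill byte, then count, returning the number of bytes actually filled) can be illustrated on a plain POSIX struct iovec array. This is a sketch under stated assumptions, not QEMU's iov_memset(): the function name sketch_iov_memset is hypothetical and it takes a raw iovec array instead of a QEMUIOVector.

```c
#include <stddef.h>
#include <string.h>
#include <sys/uio.h>

/* Fill `bytes` bytes with `fillc`, starting `offset` bytes into the
 * scatter/gather list. Returns the number of bytes actually filled,
 * which is less than `bytes` if the list ends first. */
static size_t sketch_iov_memset(const struct iovec *iov, unsigned int iov_cnt,
                                size_t offset, int fillc, size_t bytes)
{
    size_t done = 0;
    unsigned int i;

    for (i = 0; i < iov_cnt && done < bytes; i++) {
        size_t len;

        /* Skip elements that lie entirely before the start offset. */
        if (offset >= iov[i].iov_len) {
            offset -= iov[i].iov_len;
            continue;
        }

        len = iov[i].iov_len - offset;
        if (len > bytes - done) {
            len = bytes - done;
        }
        memset((char *)iov[i].iov_base + offset, fillc, len);
        done += len;
        offset = 0; /* later elements are filled from their start */
    }
    return done;
}
```

For example, with two 4-byte elements, a call with offset 2 and bytes 4 fills the last two bytes of the first element and the first two bytes of the second, and returns 4.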
Jim Meyering b6c147622d qcow2: don't leak buffer for unexpected qcow_version in header
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-25 18:12:54 +02:00
Kevin Wolf c44bfe4637 qcow2: Don't ignore failure to clear autoclear flags
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-14 17:02:19 +02:00
Paolo Bonzini 5f3777945d block: push bdrv_change_backing_file error checking up from drivers
This check applies to all drivers, but QED lacks it.

Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-10 10:32:11 +02:00
Zhi Yong Wu 15552c4ad3 qcow2: lock on prealloc
preallocate() will be locked. This is required because
qcow2_alloc_cluster_link_l2() assumes that it runs under a lock that it
can drop while COW is being performed.

Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-07 19:33:18 +02:00
Stefan Weil b9531b6eed block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-05-02 18:39:39 +02:00
Kevin Wolf 621f058940 qcow2: Zero write support
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:30 +02:00
Kevin Wolf cfcc4c62ff qcow2: Support for feature table header extension
Instead of printing an ugly bitmask, qemu can now print a more helpful
string even for yet unknown features.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 6377af48b0 qcow2: Support reading zero clusters
This adds support for reading zero clusters in version 3 images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 6744cbab8c qcow2: Version 3 images
This adds the basic infrastructure to qcow2 to handle version 3 images.
It includes code to create v3 images, allow header updates for v3 images
and checks feature bits.

It still misses support for zero clusters, so this is not a fully
compliant implementation of v3 yet.

The default for creating new images stays at v2 for now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:29 +02:00
Kevin Wolf 68d000a390 qcow2: Ignore reserved bits in get_cluster_offset
With this change, reading from a qcow2 image ignores all reserved bits
that are set in an L1 or L2 table entry.

Now get_cluster_offset() assigns *cluster_offset only the offset without
any other flags. The cluster type is no longer encoded in the offset,
but is returned as a positive return value on success.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-20 15:57:27 +02:00
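The split described in 68d000a390, a bare offset through the output pointer and the cluster type as a positive return value, can be sketched as follows. All names here are hypothetical, and the sketch assumes the qcow2 layout in which bits 9-55 of a standard L2 entry hold the host cluster offset and bit 62 marks a compressed cluster; it does not decode compressed entries.

```c
#include <stdint.h>

/* Assumed layout of a standard L2 entry (names are illustrative):
 * bits 9-55 hold the host cluster offset, bit 62 marks compression. */
#define SKETCH_L2E_OFFSET_MASK 0x00fffffffffffe00ULL
#define SKETCH_L2E_COMPRESSED  (1ULL << 62)

enum sketch_cluster_type {
    SKETCH_CLUSTER_UNALLOCATED = 0,
    SKETCH_CLUSTER_NORMAL      = 1,
    SKETCH_CLUSTER_COMPRESSED  = 2
};

/* Store only the offset bits in *cluster_offset and return the cluster
 * type. Reserved and flag bits never leak into the offset. */
static int sketch_decode_l2_entry(uint64_t l2_entry, uint64_t *cluster_offset)
{
    if (l2_entry & SKETCH_L2E_COMPRESSED) {
        /* Compressed entries use a different layout; not decoded here. */
        *cluster_offset = 0;
        return SKETCH_CLUSTER_COMPRESSED;
    }

    *cluster_offset = l2_entry & SKETCH_L2E_OFFSET_MASK;
    return *cluster_offset ? SKETCH_CLUSTER_NORMAL
                           : SKETCH_CLUSTER_UNALLOCATED;
}
```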
Paolo Bonzini 29cdb2513c block: push recursive flushing up from drivers
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-04-05 14:54:39 +02:00
Kevin Wolf 259b217310 qcow2: Add error messages in qcow2_truncate
qemu-img resize has some limitations with qcow2, but the user is only
told that "this image format does not support resize". Quite confusing,
so add some more detailed error_report() calls and change "this image
format" into "this image".

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12 15:14:06 +01:00
Kevin Wolf 3cce16f44d qcow2: Add some tracing
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-03-12 15:14:06 +01:00
Kevin Wolf 64ca6aee4f qcow2: Reject too large header extensions
Image files that make qemu-img info read several gigabytes into the
unknown header extensions list are bad. Just fail opening the image
if an extension claims to be larger than the header extension area.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-02-29 12:48:47 +01:00
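The check described in 64ca6aee4f boils down to comparing an extension's claimed length against the space left in the header extension area. A minimal sketch, assuming hypothetical names (sketch_ext_fits, end_offset) that are not taken from block/qcow2.c:

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if an extension payload of `len` bytes that starts at
 * `offset` ends at or before `end_offset`, the end of the header
 * extension area. Written to avoid overflow in offset + len. */
static bool sketch_ext_fits(uint64_t offset, uint64_t end_offset, uint32_t len)
{
    if (offset > end_offset) {
        return false;
    }
    return len <= end_offset - offset;
}
```

A caller would fail the open when this returns false, before allocating or reading anything of the claimed size.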
Kevin Wolf fd29b4bbef qcow2: Fix offset in qcow2_read_extensions
The spec says that the length of extensions is padded to 8 bytes, not
the offset. Currently this is the same because the header size is a
multiple of 8, so this is only about compatibility with future changes
to the header size.

While touching it, move the calculation to a common place instead of
duplicating it for each header extension type.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2012-02-29 12:48:47 +01:00
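The distinction made in fd29b4bbef, padding the extension length rather than the file offset, can be shown with a short calculation. This is a sketch only: the helper names are hypothetical, and SKETCH_EXT_HEADER_SIZE assumes each extension starts with a 4-byte type and a 4-byte length field.

```c
#include <stdint.h>

/* Assumption: an extension header is a 4-byte type plus a 4-byte length. */
#define SKETCH_EXT_HEADER_SIZE 8u

/* Round an extension's payload length up to the next multiple of 8. */
static uint64_t sketch_ext_padded_len(uint32_t len)
{
    return ((uint64_t)len + 7) & ~(uint64_t)7;
}

/* Offset of the extension that follows the one starting at `offset`.
 * Only the length is padded; `offset` itself is never rounded, so the
 * result stays correct even if the header size is not a multiple of 8. */
static uint64_t sketch_next_ext_offset(uint64_t offset, uint32_t len)
{
    return offset + SKETCH_EXT_HEADER_SIZE + sketch_ext_padded_len(len);
}
```

Doing the calculation in one helper also mirrors the second half of the commit message: the padding is computed in a common place instead of once per extension type.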
Kevin Wolf 423477e556 qcow2: Fix build with DEBUG_EXT enabled
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-29 12:48:47 +01:00
Kevin Wolf 75bab85ca0 qcow2: Keep unknown header extension when rewriting header
If we want header extensions to work as compatible extensions, we can't
destroy yet unknown header extensions when rewriting the header (e.g.
for changing the backing file). Save all unknown header extensions in a
list of blobs and include them in a new header.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09 16:17:51 +01:00
Kevin Wolf e24e49e619 qcow2: Update whole header at once
In order to switch the backing file, qcow2 issues multiple write
requests that each change only a part of the image header. Any failure after
the first one would leave the header in a corrupted state. With this
patch, the whole header is written at once, so we can't fail in the
middle.

At the same time, this gives us a reusable function that updates all
fields of the qcow2 header and not only the backing file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-02-09 16:17:51 +01:00
Li Zhi Hui 28c1202ba6 block/qcow2.c: call qcow2_free_snapshots in the function of qcow2_close
Signed-off-by: Li Zhi Hui <zhihuili@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15 12:40:08 +01:00
Anthony Liguori eb5d5beaeb Merge remote-tracking branch 'kwolf/for-anthony' into staging
2011-12-05 09:39:25 -06:00
Stefan Hajnoczi e8ee5e4c47 coroutine: add qemu_co_queue_restart_all()
It's common to wake up all waiting coroutines.  Introduce the
qemu_co_queue_restart_all() function to do this instead of looping over
qemu_co_queue_next() in every caller.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:38 +01:00
Stefan Hajnoczi f8a2e5e3ca block: convert qcow2, qcow, and vmdk to .bdrv_co_is_allocated()
The qcow2, qcow, and vmdk block drivers are based on coroutines.  They have a
coroutine mutex which protects internal state.  We can convert the
.bdrv_is_allocated() function to .bdrv_co_is_allocated() by holding the mutex
around the cluster lookup operation.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:37 +01:00
Kevin Wolf 42deb29fed qcow2: Return real error code in qcow2_read_snapshots
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-05 14:51:35 +01:00
Dong Xu Wang a968168c58 block: Add coroutine_fn marker to coroutine functions
Looks better when reviewing these source files.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:51:35 +01:00
Dong Xu Wang 9b2260cbd5 fix spelling in block sub directory
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-02 10:50:57 +00:00
Anthony Liguori 06d9260ffa qcow2: implement bdrv_invalidate_cache (v2)
We don't reopen the actual file, but instead invoke the close and open routines.
We specifically ignore the backing file since its contents are read-only and
therefore immutable.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-11-21 14:58:48 -06:00