qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Max Reitz	2aabe7c7a1	qcow2: Use unsigned addend for update_refcount() update_refcount() and qcow2_update_cluster_refcount() currently take a signed addend. At least one caller passes a value directly derived from an absolute refcount that should be reached ("l2_refcount - 1" in expand_zero_clusters_in_l1()). Therefore, the addend should be unsigned as well; this will be especially important for 64 bit refcounts. Because update_refcount() then no longer knows whether the refcount should be increased or decreased, it now requires an additional flag which specified exactly that. The same applies to qcow2_update_cluster_refcount(). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	7324c10f96	qcow2: Only return status from qcow2_get_refcount Refcounts can theoretically be of type uint64_t; in order to be able to represent the full range, qcow2_get_refcount() cannot use a single variable to represent both all refcount values and also keep some values reserved for errors. One solution would be to add an Error pointer parameter to qcow2_get_refcount(); however, no caller could (currently) pass that error message, so it would have to be emitted immediately and be passed to the next caller by returning -EIO or something similar. Therefore, an Error parameter does not offer any advantages here. The solution applied by this patch is simpler to use. Because no caller would be able to pass the error message, they would have to print it and free it, whereas with this patch the caller only needs to pass the returned integer (which is often a no-op from the code perspective, because that integer will be stored in a variable "ret" which will be returned by the fail path of many callers). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-03-10 14:02:21 +01:00
Max Reitz	8dd93d9339	qcow2: Add two more unalignment checks This adds checks for unaligned L2 table offsets and unaligned data cluster offsets (actually the preallocated offsets for zero clusters) to the zero cluster expansion function. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-01-23 18:17:05 +01:00
Max Reitz	11c89769dc	qcow2: Prevent numerical overflow In qcow2_alloc_cluster_offset(), num is limited to INT_MAX >> BDRV_SECTOR_BITS by all callers. However, since remaining is of type uint64_t, we might as well cast num to that type before performing the shift. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-12-10 10:31:20 +01:00
Max Reitz	ecf58777c5	block/qcow2: Simplify shared L2 handling in amend Currently, we have a bitmap for keeping track of which clusters have been created during the zero cluster expansion process. This was necessary because we need to properly increase the refcount for shared L2 tables. However, now we can simply take the L2 refcount and use it for the cluster allocated for expansion. This will be the correct refcount and therefore we don't have to remember that cluster having been allocated any more. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Message-id: 1414404776-4919-7-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-11-03 11:41:49 +00:00
Max Reitz	4057a2b24a	block/qcow2: Implement status CB for amend The only really time-consuming operation potentially performed by qcow2_amend_options() is zero cluster expansion when downgrading qcow2 images from compat=1.1 to compat=0.10, so report status of that operation and that operation only through the status CB. For this, approximate the progress as the number of L1 entries visited during the operation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Benoît Canet <benoit.canet@nodalink.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Message-id: 1414404776-4919-5-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-11-03 11:41:49 +00:00
Max Reitz	808c4b6f30	qcow2: Allow "full" discard Normally, discarded sectors should read back as zero. However, there are cases in which a sector (or rather cluster) should be discarded as if they were never written in the first place, that is, reading them should fall through to the backing file again. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Message-id: 1414159063-25977-2-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-11-03 11:41:47 +00:00
Max Reitz	a1391444fe	qcow2: Do not overflow when writing an L1 sector While writing an L1 table sector, qcow2_write_l1_entry() copies the respective range from s->l1_table to the local "buf" array. The size of s->l1_table does not have to be a multiple of L1_ENTRIES_PER_SECTOR; thus, limit the index which is used for copying all entries to the L1 size. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Peter Lieven <pl@kamp.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-10-23 15:34:02 +02:00
Max Reitz	a97c67ee6c	qcow2: Check L1/L2/reftable entries for alignment Offsets taken from the L1, L2 and refcount tables are generally assumed to be correctly aligned. However, this cannot be guaranteed if the image has been written to by something different than qemu, thus check all offsets taken from these tables for correct cluster alignment. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 1409926039-29044-5-git-send-email-mreitz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-09-22 11:39:28 +01:00
Markus Armbruster	5839e53bbc	block: Use g_new() & friends where that makes obvious sense g_new(T, n) is neater than g_malloc(sizeof(T) * n). It's also safer, for two reasons. One, it catches multiplication overflowing size_t. Two, it returns T * rather than void , which lets the compiler catch more type errors. Patch created with Coccinelle, with two manual changes on top: Add const to bdrv_iterate_format() to keep the types straight * Convert the allocation in bdrv_drop_intermediate(), which Coccinelle inexplicably misses Coccinelle semantic patch: @@ type T; @@ -g_malloc(sizeof(T)) +g_new(T, 1) @@ type T; @@ -g_try_malloc(sizeof(T)) +g_try_new(T, 1) @@ type T; @@ -g_malloc0(sizeof(T)) +g_new0(T, 1) @@ type T; @@ -g_try_malloc0(sizeof(T)) +g_try_new0(T, 1) @@ type T; expression n; @@ -g_malloc(sizeof(T) * (n)) +g_new(T, n) @@ type T; expression n; @@ -g_try_malloc(sizeof(T) * (n)) +g_try_new(T, n) @@ type T; expression n; @@ -g_malloc0(sizeof(T) * (n)) +g_new0(T, n) @@ type T; expression n; @@ -g_try_malloc0(sizeof(T) * (n)) +g_try_new0(T, n) @@ type T; expression p, n; @@ -g_realloc(p, sizeof(T) * (n)) +g_renew(T, p, n) @@ type T; expression p, n; @@ -g_try_realloc(p, sizeof(T) * (n)) +g_try_renew(T, p, n) Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-08-20 11:51:28 +02:00
Max Reitz	ff52aab2df	qcow2: Catch !*host_offset for data allocation qcow2_alloc_cluster_offset() uses host_offset == 0 as "no preferred offset" for the (data) cluster range to be allocated. However, this offset is actually valid and may be allocated on images with a corrupted refcount table or first refcount block. In this case, the corruption prevention should normally catch that write anyway (because it would overwrite the image header). But since 0 is a special value here, the function assumes that nothing has been allocated at all which it asserts against. Because this condition is not qemu's fault but rather that of a broken image, it shouldn't throw an assertion but rather mark the image corrupt and show an appropriate message, which this patch does by calling the corruption check earlier than it would be called normally (before the assertion). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-08-15 15:07:16 +02:00
Kevin Wolf	de82815db1	qcow2: Handle failure for potentially large allocations Some code in the block layer makes potentially huge allocations. Failure is not completely unexpected there, so avoid aborting qemu and handle out-of-memory situations gracefully. This patch addresses the allocations in the qcow2 block driver. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-08-15 15:07:15 +02:00
Kevin Wolf	bd60436936	qcow2: Fix memory leak in COW error path This triggers if bs->drv becomes NULL in a concurrent request. This is currently only the case when corruption prevention kicks in (i.e. at most once per image, and after that it produces I/O errors). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-05-28 14:28:46 +02:00
Max Reitz	b93f995081	qcow2: Check min_size in qcow2_grow_l1_table() First, new_l1_size is an int64_t, whereas min_size is a uint64_t. Therefore, during the loop which adjusts new_l1_size until it equals or exceeds min_size, new_l1_size might overflow and become negative. The comparison in the loop condition however will take it as an unsigned value (because min_size is unsigned) and therefore recognize it as exceeding min_size. Therefore, the loop is left with a negative new_l1_size, which is not correct. This could be fixed by making new_l1_size uint64_t. On the other hand, however, by doing this, the while loop may take forever. If min_size is e.g. UINT64_MAX, it will take new_l1_size probably multiple overflows to reach the exact same value (if it reaches it at all). Then, right after the loop, new_l1_size will be recognized as being too big anyway. Both problems require a ridiculously high min_size value, which is very unlikely to occur; but both problems are also simply avoided by checking whether min_size is sane before calculating new_l1_size (which should still be checked separately, though). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-04-30 14:46:17 +02:00
Max Reitz	c883db0df9	qcow2: Fix discard discard_single_l2() should not implement its own version of qcow2_get_cluster_type(), but rather rely on this already existing function. By doing so, it will work for compressed clusters as well (which it did not so far). Also, rename "old_offset" to "old_l2_entry", as both are quite different (and the value is indeed of the latter kind). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-04-29 16:39:51 +02:00
Kevin Wolf	8885eadedd	qcow2: Put cache reference in error case When qcow2_get_cluster_offset() sees a zero cluster in a version 2 image, it (rightfully) returns an error. But in doing so it shouldn't leak an L2 table cache reference. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2014-04-04 17:10:08 +02:00
Kevin Wolf	6b7d4c5558	qcow2: Fix copy_sectors() with VM state bs->total_sectors is not the highest possible sector number that could be involved in a copy on write operation: VM state is after the end of the virtual disk. This resulted in wrong values for the number of sectors to be copied (n). The code that checks for the end of the image isn't required any more because the code hasn't been calling the block layer's bdrv_read() for a long time; instead, it directly calls qcow2_readv(), which doesn't error out on VM state sector numbers. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-04-01 15:22:35 +02:00
Kevin Wolf	cab60de930	qcow2: Fix new L1 table size check (CVE-2014-0143) The size in bytes is assigned to an int later, so check that instead of the number of entries. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-04-01 15:22:35 +02:00
Max Reitz	dba2855572	qcow2: Check bs->drv in copy_sectors() Before dereferencing bs->drv for a call to its member bdrv_co_readv(), copy_sectors() should check whether that pointer is indeed valid, since it may have been set to NULL by e.g. a concurrent write triggering the corruption prevention mechanism. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-03-13 14:23:27 +01:00
Kevin Wolf	a71835a0cc	qcow2: Set zero flag for discarded clusters Instead of making the backing file contents visible again after a discard request, set the zero flag if possible (i.e. on version >= 3). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2014-02-21 21:02:21 +01:00
Hu Tao	16f0587e0a	qcow2: remove n_start and n_end of qcow2_alloc_cluster_offset() n_start can be actually calculated from offset. The number of sectors to be allocated(n_end - n_start) can be passed in in num. By removing n_start and n_end, we can save two parameters. The side effect is there is a bug in qcow2.c:preallocate() that passes incorrect n_start to qcow2_alloc_cluster_offset() is fixed. The bug can be triggerred by a larger cluster size than the default value(65536), for example: ./qemu-img create -f qcow2 \ -o 'cluster_size=131072,preallocation=metadata' file.img 4G Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-02-09 09:12:39 +01:00
Hu Tao	ac95acdb8e	qcow2: use start_of_cluster() and offset_into_cluster() everywhere Signed-off-by: Hu Tao <hutao@cn.fujitsu.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-12-06 16:53:50 +01:00
Peter Lieven	aa7bfbfff7	block: add flags to bdrv_*_write_zeroes Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-11-28 10:30:51 +01:00
Peter Lieven	78a52ad5ac	qcow2: fix possible corruption when reading multiple clusters if multiple sectors spanning multiple clusters are read the function count_contiguous_clusters should ensure that the cluster type should not change between the clusters. Especially the for-loop should break when we have one or more normal clusters followed by a compressed cluster. Unfortunately the wrong macro was used in the mask to compare the flags. This was discovered while debugging a data corruption issue when converting a compressed qcow2 image to raw. qemu-img reads 2MB chunks which span multiple clusters. CC: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-11-14 13:09:07 +01:00
Peter Maydell	e4ef9f465c	bswap.h: Remove cpu_to_be64wu() Replace the legacy cpu_to_be64wu() with stq_be_p(). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <rth@twiddle.net> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 1383669517-25598-9-git-send-email-peter.maydell@linaro.org Signed-off-by: Anthony Liguori <aliguori@amazon.com>	2013-11-05 19:57:47 -08:00
Max Reitz	231bb26764	qcow2: Use negated overflow check mask In qcow2_check_metadata_overlap and qcow2_pre_write_overlap_check, change the parameter signifying the checks to perform from its current positive form to a negative one, i.e., it will no longer explicitly specify every check to perform but rather a mask of checks not to perform. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-10-11 16:50:00 +02:00
Max Reitz	e3b21ef9e0	qcow2: Free allocated L2 cluster on error If an error occurs in l2_allocate, the allocated (but unused) L2 cluster should be freed. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Benoit Canet <benoit@irqsave.net> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-10-07 13:23:19 +02:00
Max Reitz	fda74f826b	qcow2: Switch L1 table in a single sequence Switching the L1 table in memory should be an atomic operation, as far as possible. Calling qcow2_free_clusters on the old L1 table on disk is not a good idea when the old L1 table is no longer valid and the address to the new one hasn't yet been written into the corresponding BDRVQcowState field. To be more specific, this can lead to segfaults due to qcow2_check_metadata_overlap trying to access the L1 table during the free operation. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-10-02 15:38:29 +02:00
Kevin Wolf	61653008ad	qcow2: Remove useless count_contiguous_clusters() parameter All callers pass start = 0, and it's doubtful if any other value would actually do what you expect. Remove the parameter. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com>	2013-09-27 17:22:43 +02:00
Max Reitz	22f0dd29af	qcow2: COMPRESSED on count_contiguous_clusters Compressed clusters can never be contiguous, therefore the corresponding flag does not need to be given explicitly to count_contiguous_clusters. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-27 17:22:43 +02:00
Max Reitz	15684a4742	qcow2: count_contiguous_clusters and compression The function is not intended to be used on compressed clusters and will not work correctly, if used anyway, since L2E_OFFSET_MASK is not the right mask for determining the offset of compressed clusters. Therefore, assert that the first cluster is not compressed and always include the compression flag in the mask of significant flags, i.e., stop the search as soon as a compressed cluster occurs. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-27 17:22:43 +02:00
Max Reitz	320c706666	qcow2: Free only newly allocated clusters on error In expand_zero_clusters_in_l1, a new cluster is only allocated if it was not already preallocated. On error, such preallocated clusters should not be freed, but only the newly allocated ones. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-27 17:22:43 +02:00
Max Reitz	be0b742ee3	qcow2: Always use error path in l2_allocate Just returning -errno in some cases prevents trace_qcow2_l2_allocate_done from being executed (and, in one case, also the unused allocated L2 table from being freed). Always going down the error path fixes this. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-27 17:22:43 +02:00
Max Reitz	8585afd813	qcow2: Don't put invalid L2 table into cache In l2_allocate, the fail path is executed if qcow2_cache_flush fails. However, the L2 table has not yet been fetched from the L2 table cache. The qcow2_cache_put in the fail path therefore basically gives an undefined argument as the L2 table address (in this case). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-27 11:31:59 +02:00
Max Reitz	e390cf5a97	qcow2: Correct bitmap size in zero expansion Since the expanded_clusters bitmap is addressed using host offsets in the underlying image file, the correct size to use for allocating the bitmap is not determined by the guest disk image but by the underlying host image file. Furthermore, this size may change during the expansion due to cluster allocations on growable image files. In this case, the bitmap needs to be resized as well to reflect the growth. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-27 11:16:35 +02:00
Max Reitz	c01dbccbad	qcow2: Assert against currently impossible overflow If qcow2_alloc_cluster_link_l2 is called with a QCowL2Meta describing a request crossing L2 boundaries, a buffer overflow will occur. This is impossible right now since such requests are never generated (every request is shortened to L2 boundaries before) and probably also completely unintended (considering the name "QCowL2Meta"), however, it is still worth an assertion. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-25 21:57:44 +02:00
Max Reitz	32b6444d23	qcow2-cluster: Expand zero clusters Add functionality for expanding zero clusters. This is necessary for downgrading the image version to one without zero cluster support. For non-backed images, this function may also just discard zero clusters instead of truly expanding them. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-09-12 10:12:46 +02:00
Kevin Wolf	670df5e3b4	qcow2: Pass discard type to qcow2_discard_clusters() The function will be used internally instead of only being called for guest discard requests. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com>	2013-09-12 10:12:46 +02:00
Max Reitz	e23e400ec6	qcow2-refcount: Repair OFLAG_COPIED errors Since the OFLAG_COPIED checks are now executed after the refcounts have been repaired (if repairing), it is safe to assume that they are correct but the OFLAG_COPIED flag may be not. Therefore, if its value differs from what it should be (considering the according refcount), that discrepancy can be repaired by correctly setting (or clearing that flag. Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-08-30 15:48:44 +02:00
Max Reitz	cf93980e77	qcow2: Employ metadata overlap checks The pre-write overlap check function is now called before most of the qcow2 writes (aborting it on collision or other error). Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-08-30 15:48:43 +02:00
Kevin Wolf	0b919fae31	qcow2: Batch discards This optimises the discard operation for freed clusters by batching discard requests (both snapshot deletion and bdrv_discard end up updating the refcounts cluster by cluster). Note that we don't discard asynchronously, but keep s->lock held. This is to avoid that a freed cluster is reallocated and written to while the discard is still in flight. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-06-24 10:25:17 +02:00
Kevin Wolf	6cfcb9b8b9	qcow2: Add refcount update reason to all callers This adds a refcount update reason to all callers of update_refcounts(), so that a follow-up patch can use this information to decide whether clusters that reach a refcount of 0 should be discarded in the image file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-06-24 10:25:17 +02:00
Kevin Wolf	2cf7cfa1cd	qcow2: Catch some L1 table index overflows This catches the situation that is described in the bug report at https://bugs.launchpad.net/qemu/+bug/865518 and goes like this: $ qemu-img create -f qcow2 huge.qcow2 $((10241024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off $ qemu-io /tmp/huge.qcow2 -c "write $((102410241024102410241024 - 1024)) 512" Segmentation fault With this patch applied the segfault will be avoided, however the case will still fail, though gracefully: $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((10241024))T Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off qemu-img: The image size is too large for file format 'qcow2' Note that even long before these overflow checks kick in, you get insanely high memory usage (up to INT_MAX sizeof(uint64_t) = 16 GB for the L1 table), so with somewhat smaller image sizes you'll probably see qemu aborting for a failed g_malloc(). If you need huge image sizes, you should increase the cluster size to the maximum of 2 MB in order to get higher limits. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-05-14 16:44:33 +02:00
Kevin Wolf	ecdd5333ab	qcow2: Gather clusters in a looping loop Instead of just checking once in exactly this order if there are dependendies, non-COW clusters and new allocation, this starts looping around these. This way we can, for example, gather non-COW clusters after new allocations as long as the host cluster offsets stay contiguous. Once handle_dependencies() is extended so that COW areas of in-flight allocations can be overwritten, this allows to continue with gathering other clusters (we wouldn't be able to do that without this change because we would have missed a possible second dependency in one of the next clusters). This means that in the typical sequential write case, we can combine the COW overwrite of one cluster with the allocation of the next cluster as soon as something like Delayed COW gets actually implemented. It is only by avoiding splitting requests this way that Delayed COW actually starts improving performance noticably. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	2c3b32d256	qcow2: Move cluster gathering to a non-looping loop This patch is mainly to separate the indentation change from the semantic changes. All that really changes here is that everything moves into a while loop, all 'goto done' become 'break' and at the end of the loop a new 'break is inserted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	88c6588c51	qcow2: Allow requests with multiple l2metas Instead of expecting a single l2meta, have a list of them. This allows to still have a single I/O request for the guest data, even though multiple l2meta may be needed in order to describe both a COW overwrite and a new cluster allocation (typical sequential write case). Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	710c2496d8	qcow2: Use byte granularity in qcow2_alloc_cluster_offset() This gets rid of the nb_clusters and keep_clusters and the associated complicated calculations. Just advance the number of bytes that have been processed and everything is fine. This patch advances the variables even after the last operation even though they aren't used any more afterwards to make things look more uniform. A later patch will turn the whole thing into a loop and then it actually starts making sense. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:44 +01:00
Kevin Wolf	411d62b04b	qcow2: Prepare handle_alloc/copied() for byte granularity This makes handle_alloc() and handle_copied() return byte-granularity host offsets instead of returning always the cluster start. This is required so that qcow2_alloc_cluster_offset() can stop aligning everything to cluster boundaries. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	e62daaf679	qcow2: handle_copied(): Implement non-zero host_offset Look only for clusters that start at a given physical offset. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00
Kevin Wolf	c53ede9f6d	qcow2: handle_copied(): Get rid of keep_clusters parameter Now *bytes is used to return the length of the area that can be written to without performing an allocation or COW. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-03-28 11:52:43 +01:00

1 2 3

146 Commits