qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Alberto Garcia	3441ad4bc4	qcow2: Document and enforce the QCowL2Meta invariants The QCowL2Meta structure is used to store information about a part of a write request that touches clusters that need changes in their L2 entries. This happens with newly-allocated clusters or subclusters. This structure has changed a bit since it was first created and its current documentation is not quite up-to-date. A write request can span a region consisting of a combination of clusters of different types, and qcow2_alloc_host_offset() can repeatedly call handle_copied() and handle_alloc() to add more clusters to the mix as long as they all are contiguous on the image file. Because of this a write request has a list of QCowL2Meta structures, one for each part of the request that needs changes in the L2 metadata. Each one of them spans nb_clusters and has two copy-on-write regions located immediately before and after the middle region touched by that part of the write request. Even when those regions themselves are empty their offsets must be correct because they are used to know the location of the middle region. This was not always the case but it is not a problem anymore because the only two places where QCowL2Meta structures are created (calculate_l2_meta() and qcow2_co_truncate()) ensure that the copy-on-write regions are correctly defined, and so do assertions like the ones in perform_cow(). The conditional initialization of the 'written_to' variable is therefore unnecessary and is removed by this patch. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20201007161323.4667-1-berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-11-09 15:44:21 +01:00
Alberto Garcia	46cd1e8a47	qcow2: Skip copy-on-write when allocating a zero cluster Since commit `c8bb23cbdb` when a write request results in a new allocation QEMU first tries to see if the rest of the cluster outside the written area contains only zeroes. In that case, instead of doing a normal copy-on-write operation and writing explicit zero buffers to disk, the code zeroes the whole cluster efficiently using pwrite_zeroes() with BDRV_REQ_NO_FALLBACK. This improves performance very significantly but it only happens when we are writing to an area that was completely unallocated before. Zero clusters (QCOW2_CLUSTER_ZERO_*) are treated like normal clusters and are therefore slower to allocate. This happens because the code uses bdrv_is_allocated_above() rather bdrv_block_status_above(). The former is not as accurate for this purpose but it is faster. However in the case of qcow2 the underlying call does already report zero clusters just fine so there is no reason why we cannot use that information. After testing 4KB writes on an image that only contains zero clusters this patch results in almost five times more IOPS. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <6d77cab968c501c44d6e1089b9bc91b04170b49e.1603731354.git.berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-10-27 15:26:20 +01:00
Vladimir Sementsov-Ogievskiy	67c095c8b8	block/io: fix bdrv_co_block_status_above bdrv_co_block_status_above has several design problems with handling short backing files: 1. With want_zeros=true, it may return ret with BDRV_BLOCK_ZERO but without BDRV_BLOCK_ALLOCATED flag, when actually short backing file which produces these after-EOF zeros is inside requested backing sequence. 2. With want_zero=false, it may return pnum=0 prior to actual EOF, because of EOF of short backing file. Fix these things, making logic about short backing files clearer. With fixed bdrv_block_status_above we also have to improve is_zero in qcow2 code, otherwise iotest 154 will fail, because with this patch we stop to merge zeros of different types (produced by fully unallocated in the whole backing chain regions vs produced by short backing files). Note also, that this patch leaves for another day the general problem around block-status: misuse of BDRV_BLOCK_ALLOCATED as is-fs-allocated vs go-to-backing. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20200924194003.22080-2-vsementsov@virtuozzo.com [Fix s/comes/come/ as suggested by Eric Blake --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2020-10-23 13:42:16 +01:00
Alberto Garcia	bfd0989acf	qcow2: Convert qcow2_alloc_cluster_offset() into qcow2_alloc_host_offset() qcow2_alloc_cluster_offset() takes an (unaligned) guest offset and returns the (aligned) offset of the corresponding cluster in the qcow2 image. In practice none of the callers need to know where the cluster starts so this patch makes the function calculate and return the final host offset directly. The function is also renamed accordingly. See `388e581615` for a similar change to qcow2_get_cluster_offset(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <9bfef50ec9200d752413be4fc2aeb22a28378817.1599833007.git.berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:31:10 +02:00
Alberto Garcia	8e958260c5	qcow2: Make preallocate_co() resize the image to the correct size This function preallocates metadata structures and then extends the image to its new size, but that new size calculation is wrong because it doesn't take into account that the host_offset variable is always cluster-aligned. This problem can be reproduced with preallocation=metadata when the original size is not cluster-aligned but the new size is. In this case the final image size will be shorter than expected. qemu-img create -f qcow2 img.qcow2 31k qemu-img resize --preallocation=metadata img.qcow2 128k Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <adeb8b059917b141d5f5b3bd2a016262d3052c79.1599833007.git.berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> [mreitz: Mark compat=0.10 unsupported for iotest 125] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:30:36 +02:00
Alberto Garcia	580384d637	qcow2: Return the original error code in qcow2_co_pwrite_zeroes() This function checks the current status of a (sub)cluster in order to see if an unaligned 'write zeroes' request can be done efficiently by simply updating the L2 metadata and without having to write actual zeroes to disk. If the situation does not allow using the fast path then the function returns -ENOTSUP and the caller falls back to writing zeroes. If can happen however that the aforementioned check returns an actual error code so in this case we should pass it to the caller. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200909123739.719-1-berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:13 +02:00
Alberto Garcia	1a52b73dba	qcow2: Handle QCowL2Meta on error in preallocate_co() If qcow2_alloc_cluster_offset() or qcow2_alloc_cluster_link_l2() fail then this function simply returns the error code, potentially leaking the QCowL2Meta structure and leaving stale items in s->cluster_allocs. A second problem is that this function calls qcow2_free_any_clusters() on failure but passing a host cluster offset instead of an L2 entry. Luckily for normal uncompressed clusters a raw offset also works like a valid L2 entry so it works just the same, but we should be using qcow2_free_clusters() instead. This patch fixes both problems by using qcow2_handle_l2meta(). Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <cd3a6b9abd43f9c0b60be413d760f0cacc67eb66.1599573989.git.berto@igalia.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:13 +02:00
Alberto Garcia	f7bd5bba1b	qcow2: Don't check nb_clusters when removing l2meta from the list In the past, when a new cluster was allocated the l2meta structure was a variable in the stack so it was necessary to have a way to tell whether it had been initialized and contained valid data or not. The nb_clusters field was used for this purpose. Since commit `f50f88b9fe` this is no longer the case, l2meta (nowadays a pointer to a list) is only allocated when needed and nb_clusters is guaranteed to be > 0 so this check is unnecessary. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <ab0b67c29c7ba26e598db35f12aa5ab5982539c1.1599150873.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:13 +02:00
Alberto Garcia	02b1ecfa10	qcow2: Use macros for the L1, refcount and bitmap table entry sizes This patch replaces instances of sizeof(uint64_t) in the qcow2 driver with macros that indicate what those sizes are actually referring to. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200828110828.13833-1-berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-09-15 11:05:12 +02:00
Alberto Garcia	2118771ddf	qcow2: Allow preallocation and backing files if extended_l2 is set Traditional qcow2 images don't allow preallocation if a backing file is set. This is because once a cluster is allocated there is no way to tell that its data should be read from the backing file. Extended L2 entries have individual allocation bits for each subcluster, and therefore it is perfectly possible to have an allocated cluster with all its subclusters unallocated. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <6d5b0f38e7dc5f2f31d8cab1cb92044e9909aece.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 09:20:04 +02:00
Alberto Garcia	7be2025258	qcow2: Add the 'extended_l2' option and the QCOW2_INCOMPAT_EXTL2 bit Now that the implementation of subclusters is complete we can finally add the necessary options to create and read images with this feature, which we call "extended L2 entries". Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <6476caaa73216bd05b7bb2d504a20415e1665176.1594396418.git.berto@igalia.com> [mreitz: %s/5\.1/5.2/; fixed 302's and 303's reference output] Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 09:19:55 +02:00
Alberto Garcia	40dee94320	qcow2: Add prealloc field to QCowL2Meta This field allows us to indicate that the L2 metadata update does not come from a write request with actual data but from a preallocation request. For traditional images this does not make any difference, but for images with extended L2 entries this means that the clusters are allocated normally in the L2 table but individual subclusters are marked as unallocated. This will allow preallocating images that have a backing file. There is one special case: when we resize an existing image we can also request that the new clusters are preallocated. If the image already had a backing file then we have to hide any possible stale data and zero out the new clusters (see commit `955c7d6687` for more details). In this case the subclusters cannot be left as unallocated so the L2 bitmap must be updated. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <960d4c444a4f5a870e2b47e5da322a73cd9a2f5a.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	0dd07b298f	qcow2: Add subcluster support to qcow2_measure() Extended L2 entries are bigger than normal L2 entries so this has an impact on the amount of metadata needed for a qcow2 file. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <7efae2efd5e36b42d2570743a12576d68ce53685.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	a6841a2de6	qcow2: Add subcluster support to qcow2_co_pwrite_zeroes() This works now at the subcluster level and pwrite_zeroes_alignment is updated accordingly. qcow2_cluster_zeroize() is turned into qcow2_subcluster_zeroize() with the following changes: - The request can now be subcluster-aligned. - The cluster-aligned body of the request is still zeroized using zero_in_l2_slice() as before. - The subcluster-aligned head and tail of the request are zeroized with the new zero_l2_subclusters() function. There is just one thing to take into account for a possible future improvement: compressed clusters cannot be partially zeroized so zero_l2_subclusters() on the head or the tail can return -ENOTSUP. This makes the caller repeat the complete request and write actual zeroes to disk. This is sub-optimal because 1) if the head area was compressed we would still be able to use the fast path for the body and possibly the tail. 2) if the tail area was compressed we are writing zeroes to the head and the body areas, which are already zeroized. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <17e05e2ee7e12f10dcf012da81e83ebe27eb3bef.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	bf4a66eed4	qcow2: Add subcluster support to handle_alloc_space() The bdrv_co_pwrite_zeroes() call here fills complete clusters with zeroes, but it can happen that some subclusters are not part of the write request or the copy-on-write. This patch makes sure that only the affected subclusters are overwritten. A potential improvement would be to also fill with zeroes the other subclusters if we can guarantee that we are not overwriting existing data. However this would waste more disk space, so we should first evaluate if it's really worth doing. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <b3dc97e8e2240ddb5191a4f930e8fc9653f94621.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	97490a143e	qcow2: Handle QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC When dealing with subcluster types there is a new value called QCOW2_SUBCLUSTER_UNALLOCATED_ALLOC that has no equivalent in QCow2ClusterType. This patch handles that value in all places where subcluster types are processed. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <bf09e2e2439a468a901bb96ace411eed9ee50295.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	10dabdc596	qcow2: Replace QCOW2_CLUSTER_* with QCOW2_SUBCLUSTER_* In order to support extended L2 entries some functions of the qcow2 driver need to start dealing with subclusters instead of clusters. qcow2_get_host_offset() is modified to return the subcluster type instead of the cluster type, and all callers are updated to replace all values of QCow2ClusterType with their QCow2SubclusterType equivalents. This patch only changes the data types, there are no semantic changes. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <f6c29737c295f32cbee74c903c30b01820363b34.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	ca4a0bb81b	qcow2: Add cluster type parameter to qcow2_get_host_offset() This function returns an integer that can be either an error code or a cluster type (a value from the QCow2ClusterType enum). We are going to start using subcluster types instead of cluster types in some functions so it's better to use the exact data types instead of integers for clarity and in order to detect errors more easily. This patch makes qcow2_get_host_offset() return 0 on success and puts the returned cluster type in a separate parameter. There are no semantic changes. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <396b6eab1859a271551dcd7dcba77f8934aa3c3f.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	c8fd8554d9	qcow2: Add l2_entry_size() qcow2 images with subclusters have 128-bit L2 entries. The first 64 bits contain the same information as traditional images and the last 64 bits form a bitmap with the status of each individual subcluster. Because of that we cannot assume that L2 entries are sizeof(uint64_t) anymore. This function returns the proper value for the image. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <d34d578bd0380e739e2dde3e8dd6187d3d249fa9.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	d0346b5591	qcow2: Add subcluster-related fields to BDRVQcow2State This patch adds the following new fields to BDRVQcow2State: - subclusters_per_cluster: Number of subclusters in a cluster - subcluster_size: The size of each subcluster, in bytes - subcluster_bits: No. of bits so 1 << subcluster_bits = subcluster_size Images without subclusters are treated as if they had exactly one subcluster per cluster (i.e. subcluster_size = cluster_size). Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <55bfeac86b092fa2c9d182a95cbeb479ff7eca4f.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	388e581615	qcow2: Convert qcow2_get_cluster_offset() into qcow2_get_host_offset() qcow2_get_cluster_offset() takes an (unaligned) guest offset and returns the (aligned) offset of the corresponding cluster in the qcow2 image. In practice none of the callers need to know where the cluster starts so this patch makes the function calculate and return the final host offset directly. The function is also renamed accordingly. There is a pre-existing exception with compressed clusters: in this case the function returns the complete cluster descriptor (containing the offset and size of the compressed data). This does not change with this patch but it is now documented. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <ffae6cdc5ca8950e8280ac0f696dcc376cb07095.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Alberto Garcia	9c4269d54b	qcow2: Make Qcow2AioTask store the full host offset The file_cluster_offset field of Qcow2AioTask stores a cluster-aligned host offset. In practice this is not very useful because all users() of this structure need the final host offset into the cluster, which they calculate using host_offset = file_cluster_offset + offset_into_cluster(s, offset) There is no reason why Qcow2AioTask cannot store host_offset directly and that is what this patch does. () compressed clusters are the exception: in this case what file_cluster_offset was storing was the full compressed cluster descriptor (offset + size). This does not change with this patch but it is documented now. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <07c4b15c644dcf06c9459f98846ac1c4ea96e26f.1594396418.git.berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-08-25 08:33:20 +02:00
Andrey Shinkevich	8098969cf2	qcow2: Fix capitalization of header extension constant. Make the capitalization of the hexadecimal numbers consistent for the QCOW2 header extension constants in docs/interop/qcow2.txt. Suggested-by: Eric Blake <eblake@redhat.com> Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <1594973699-781898-2-git-send-email-andrey.shinkevich@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2020-07-27 15:39:58 -05:00
Eric Blake	e54ee1b385	block: Add support to warn on backing file change without format For now, this is a mechanical addition; all callers pass false. But the next patch will use it to improve 'qemu-img rebase -u' when selecting a backing file with no format. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Reviewed-by: Ján Tomko <jtomko@redhat.com> Message-Id: <20200706203954.341758-10-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Eric Blake	bc5ee6da71	qcow2: Deprecate use of qemu-img amend to change backing file The use of 'qemu-img amend' to change qcow2 backing files is not tested very well. In particular, our implementation has a bug where if a new backing file is provided without a format, then the prior format is blindly reused, even if this results in data corruption, but this is not caught by iotests. There are also situations where amending other options needs access to the original backing file (for example, on a downgrade to a v2 image, knowing whether a v3 zero cluster must be allocated or may be left unallocated depends on knowing whether the backing file already reads as zero), but the command line does not have a nice way to tell us both the backing file to use for opening the image as well as the backing file to install after the operation is complete. Even if we do allow changing the backing file, it is redundant with the existing ability to change backing files via 'qemu-img rebase -u'. It is time to deprecate this support (leaving the existing behavior intact, even if it is buggy), and at a point in the future, require the use of only 'qemu-img rebase' for adjusting backing chain relations, saving 'qemu-img amend' for changes unrelated to the backing chain. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200706203954.341758-8-eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-07-14 15:18:59 +02:00
Markus Armbruster	b11a093c60	qapi: Smooth another visitor error checking pattern Convert visit_type_FOO(v, ..., &ptr, &err); ... if (err) { ... } to visit_type_FOO(v, ..., &ptr, errp); ... if (!ptr) { ... } for functions that set @ptr to non-null / null on success / error. Eliminate error_propagate() that are now unnecessary. Delete @err that are now unused. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-40-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	af175e85f9	error: Eliminate error_propagate() with Coccinelle, part 2 When all we do with an Error we receive into a local variable is propagating to somewhere else, we can just as well receive it there right away. The previous commit did that with a Coccinelle script I consider fairly trustworthy. This commit uses the same script with the matching of return taken out, i.e. we convert if (!foo(..., &err)) { ... error_propagate(errp, err); ... } to if (!foo(..., errp)) { ... ... } This is unsound: @err could still be read between afterwards. I don't know how to express "no read of @err without an intervening write" in Coccinelle. Instead, I manually double-checked for uses of @err. Suboptimal line breaks tweaked manually. qdev_realize() simplified further to placate scripts/checkpatch.pl. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200707160613.848843-36-armbru@redhat.com>	2020-07-10 15:18:08 +02:00
Markus Armbruster	235e59cf03	qemu-option: Use returned bool to check for failure The previous commit enables conversion of foo(..., &err); if (err) { ... } to if (!foo(..., &err)) { ... } for QemuOpts functions that now return true / false on success / error. Coccinelle script: @@ identifier fun = { opts_do_parse, parse_option_bool, parse_option_number, parse_option_size, qemu_opt_parse, qemu_opt_rename, qemu_opt_set, qemu_opt_set_bool, qemu_opt_set_number, qemu_opts_absorb_qdict, qemu_opts_do_parse, qemu_opts_from_qdict_entry, qemu_opts_set, qemu_opts_validate }; expression list args, args2; typedef Error; Error *err; @@ - fun(args, &err, args2); - if (err) + if (!fun(args, &err, args2)) { ... } A few line breaks tidied up manually. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-Id: <20200707160613.848843-15-armbru@redhat.com> [Conflict with commit `0b6786a9c1` "block/amend: refactor qcow2 amend options" resolved by rerunning Coccinelle on master's version]	2020-07-10 15:17:35 +02:00
Vladimir Sementsov-Ogievskiy	a2adbbf603	block: drop unallocated_blocks_are_zero Currently this field only set by qed and qcow2. But in fact, all backing-supporting formats (parallels, qcow, qcow2, qed, vmdk) share these semantics: on unallocated blocks, if there is no backing file they just memset the buffer with zeroes. So, document this behavior for .supports_backing and drop .unallocated_blocks_are_zero Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200528094405.145708-10-vsementsov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 10:34:14 +02:00
Maxim Levitsky	8ea1613d91	block/qcow2: implement blockdev-amend Currently the implementation only supports amending the encryption options, unlike the qemu-img version Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200608094030.670121-14-mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Maxim Levitsky	90766d9db9	block/qcow2: extend qemu-img amend interface with crypto options Now that we have all the infrastructure in place, wire it in the qcow2 driver and expose this to the user. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200608094030.670121-9-mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Maxim Levitsky	0b6786a9c1	block/amend: refactor qcow2 amend options Some qcow2 create options can't be used for amend. Remove them from the qcow2 create options and add generic logic to detect such options in qemu-img Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> [mreitz: Dropped some iotests reference output hunks that became unnecessary thanks to "iotests: Make _filter_img_create more active"] Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200625125548.870061-12-mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Maxim Levitsky	df373fb0a3	block/amend: separate amend and create options for qemu-img Some options are only useful for creation (or hard to be amended, like cluster size for qcow2), while some other options are only useful for amend, like upcoming keyslot management options for luks Since currently only qcow2 supports amend, move all its options to a common macro and then include it in each action option list. In future it might be useful to remove some options which are not supported anyway from amend list, which currently cause an error message if amended. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200608094030.670121-5-mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Maxim Levitsky	a3579bfa0a	block/amend: add 'force' option 'force' option will be used for some unsafe amend operations. This includes things like erasing last keyslot in luks based formats (which destroys the data, unless the master key is backed up by external means), but that _might_ be desired result. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200608094030.670121-4-mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:49:28 +02:00
Alberto Garcia	a5675f3901	qcow2: Fix preallocation on images with unaligned sizes When resizing an image with qcow2_co_truncate() using the falloc or full preallocation modes the code assumes that both the old and new sizes are cluster-aligned. There are two problems with this: 1) The calculation of how many clusters are involved does not always get the right result. Example: creating a 60KB image and resizing it (with preallocation=full) to 80KB won't allocate the second cluster. 2) No copy-on-write is performed, so in the previous example if there is a backing file then the first 60KB of the first cluster won't be filled with data from the backing file. This patch fixes both issues. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200617140036.20311-1-berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-07-06 08:33:06 +02:00
Eric Blake	5d72c68b49	qcow2: Expose bitmaps' size during measure It's useful to know how much space can be occupied by qcow2 persistent bitmaps, even though such metadata is unrelated to the guest-visible data. Report this value as an additional QMP field, present when measuring an existing image and output format that both support bitmaps. Update iotest 178 and 190 to updated output, as well as new coverage in 190 demonstrating non-zero values made possible with the recently-added qemu-img bitmap command (see `3b51ab4b`). The new 'bitmaps size:' field is displayed automatically as part of 'qemu-img measure' any time it is present in QMP (that is, any time both the source image being measured and destination format support bitmaps, even if the measurement is 0 because there are no bitmaps present). If the field is absent, it means that no bitmaps can be copied (source, destination, or both lack bitmaps, including when measuring based on size rather than on a source image). This behavior is compatible with an upcoming patch adding 'qemu-img convert --bitmaps': that command will fail in the same situations where this patch omits the field. The addition of a new field demonstrates why we should always zero-initialize qapi C structs; while the qcow2 driver still fully populates all fields, the raw and crypto drivers had to be tweaked to avoid uninitialized data. Consideration was also given towards having a 'qemu-img measure --bitmaps' which errors out when bitmaps are not possible, and otherwise sums the bitmaps into the existing allocation totals rather than displaying as a separate field, as a potential convenience factor. But this was ultimately decided to be more complexity than necessary when the QMP interface was sufficient enough with bitmaps remaining a separate field. See also: https://bugzilla.redhat.com/1779904 Reported-by: Nir Soffer <nsoffer@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200521192137.1120211-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-05-28 13:16:16 -05:00
Eric Blake	ef893b5c84	block: Make it easier to learn which BDS support bitmaps Upcoming patches will enhance bitmap support in qemu-img, but in doing so, it turns out to be nice to suppress output when persistent bitmaps make no sense (such as on a qcow2 v2 image). Add a hook to make this easier to query. This patch adds a new callback .bdrv_supports_persistent_dirty_bitmap, rather than trying to shoehorn the answer in via existing callbacks. In particular, while it might have been possible to overload .bdrv_co_can_store_new_dirty_bitmap to special-case a NULL input to answer whether any persistent bitmaps are supported, that is at odds with whether a particular bitmap can be stored (for example, even on an image that supports persistent bitmaps but has currently filled up the maximum number of bitmaps, attempts to store another one should fail); and the new functionality doesn't require coroutine safety. Similarly, we could have added one more piece of information to .bdrv_get_info, but then again, most callers to that function tend to already discard extraneous information, and making it a catch-all rather than a series of dedicated scalar queries hasn't really simplified life. In the future, when we improve the ability to look up bitmaps through a filter, we will probably also want to teach the block layer to automatically let filters pass this request on through. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513011648.166876-4-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2020-05-19 10:32:14 -05:00
Max Reitz	69dca43d6b	block: Use bdrv_default_perms() bdrv_default_perms() can decide which permission profile to use based on the BdrvChildRole, so block drivers do not need to select it explicitly. The blkverify driver now no longer shares the WRITE permission for the image to verify. We thus have to adjust two places in test-block-iothread not to take it. (Note that in theory, blkverify should behave like quorum in this regard and share neither WRITE nor RESIZE for both of its children. In practice, it does not really matter, because blkverify is used only for debugging, so we might as well keep its permissions rather liberal.) Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-30-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	8b1869daad	block: Make format drivers use child_of_bds Commonly, they need to pass the BDRV_CHILD_IMAGE set as the BdrvChildRole; but there are exceptions for drivers with external data files (qcow2 and vmdk). Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-26-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	258b776515	block: Add BdrvChildRole to BdrvChild For now, it is always set to 0. Later patches in this series will ensure that all callers pass an appropriate combination of flags. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20200513110544.176672-6-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Max Reitz	d67066d8bc	block: Add BlockDriver.is_format We want to unify child_format and child_file at some point. One of the important things that set format drivers apart from other drivers is that they do not expect other format nodes under them (except in the backing chain), i.e. we must not probe formats inside of formats. That means we need something on which to distinguish format drivers from others, and hence this flag. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200513110544.176672-3-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-18 19:05:25 +02:00
Denis Plotnikov	d298ac10ad	qcow2: add zstd cluster compression zstd significantly reduces cluster compression time. It provides better compression performance maintaining the same level of the compression ratio in comparison with zlib, which, at the moment, is the only compression method available. The performance test results: Test compresses and decompresses qemu qcow2 image with just installed rhel-7.6 guest. Image cluster size: 64K. Image on disk size: 2.2G The test was conducted with brd disk to reduce the influence of disk subsystem to the test results. The results is given in seconds. compress cmd: time ./qemu-img convert -O qcow2 -c -o compression_type=[zlib\|zstd] src.img [zlib\|zstd]_compressed.img decompress cmd time ./qemu-img convert -O qcow2 [zlib\|zstd]_compressed.img uncompressed.img compression decompression zlib zstd zlib zstd ------------------------------------------------------------ real 65.5 16.3 (-75 %) 1.9 1.6 (-16 %) user 65.0 15.8 5.3 2.5 sys 3.3 0.2 2.0 2.0 Both ZLIB and ZSTD gave the same compression ratio: 1.57 compressed image size in both cases: 1.4G Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> QAPI part: Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200507082521.29210-4-dplotnikov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-05-13 14:20:31 +02:00
Denis Plotnikov	572ad9783f	qcow2: introduce compression type feature The patch adds some preparation parts for incompatible compression type feature to qcow2 allowing the use different compression methods for image clusters (de)compressing. It is implied that the compression type is set on the image creation and can be changed only later by image conversion, thus compression type defines the only compression algorithm used for the image, and thus, for all image clusters. The goal of the feature is to add support of other compression methods to qcow2. For example, ZSTD which is more effective on compression than ZLIB. The default compression is ZLIB. Images created with ZLIB compression type are backward compatible with older qemu versions. Adding of the compression type breaks a number of tests because now the compression type is reported on image creation and there are some changes in the qcow2 header in size and offsets. The tests are fixed in the following ways: * filter out compression_type for many tests * fix header size, feature table size and backing file offset affected tests: 031, 036, 061, 080 header_size +=8: 1 byte compression type 7 bytes padding feature_table += 48: incompatible feature compression type backing_file_offset += 56 (8 + 48 -> header_change + feature_table_change) * add "compression type" for test output matching when it isn't filtered affected tests: 049, 060, 061, 065, 082, 085, 144, 182, 185, 198, 206, 242, 255, 274, 280 Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> QAPI part: Acked-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20200507082521.29210-2-dplotnikov@virtuozzo.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-05-13 14:20:31 +02:00
Eric Blake	47e0b38a13	block: Drop unused .bdrv_has_zero_init_truncate Now that there are no clients of bdrv_has_zero_init_truncate, none of the drivers need to worry about providing it. What's more, this eliminates a source of some confusion: a literal reading of the documentation as written in `ceaca56f` and implemented in commit `1dcaf527` claims that a driver which returns 0 for bdrv_has_zero_init_truncate() must not return 1 for bdrv_has_zero_init(); this condition was violated for parallels, qcow, and sometimes for vdi, although in practice it did not matter since those drivers also lacked .bdrv_co_truncate. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200428202905.770727-10-eblake@redhat.com> Acked-by: Richard W.M. Jones <rjones@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-08 13:26:35 +02:00
Max Reitz	4b96fa3846	qcow2: Fix preallocation on block devices Calling bdrv_getlength() to get the pre-truncate file size will not really work on block devices, because they have always the same length, and trying to write beyond it will fail with a rather cryptic error message. Instead, we should use qcow2_get_last_cluster() and bdrv_getlength() only as a fallback. Before this patch: $ truncate -s 1G test.img $ sudo losetup -f --show test.img /dev/loop0 $ sudo qemu-img create -f qcow2 -o preallocation=full /dev/loop0 64M Formatting '/dev/loop0', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=full lazy_refcounts=off refcount_bits=16 qemu-img: /dev/loop0: Could not resize image: Failed to resize refcount structures: No space left on device With this patch: $ sudo qemu-img create -f qcow2 -o preallocation=full /dev/loop0 64M Formatting '/dev/loop0', fmt=qcow2 size=67108864 cluster_size=65536 preallocation=full lazy_refcounts=off refcount_bits=16 qemu-img: /dev/loop0: Could not resize image: Failed to resize underlying file: Preallocation mode 'full' unsupported for this non-regular file So as you can see, it still fails, but now the problem is missing support on the block device level, so we at least get a better error message. Note that we cannot preallocate block devices on truncate by design, because we do not know what area to preallocate. Their length is always the same, the truncate operation does not change it. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200505141801.1096763-1-mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-08 13:26:35 +02:00
Alberto Garcia	e4d7019e1a	qcow2: Avoid integer wraparound in qcow2_co_truncate() After commit `f01643fb8b` when an image is extended and BDRV_REQ_ZERO_WRITE is set then the new clusters are zeroized. The code however does not detect correctly situations when the old and the new end of the image are within the same cluster. The problem can be reproduced with these steps: qemu-img create -f qcow2 backing.qcow2 1M qemu-img create -f qcow2 -F qcow2 -b backing.qcow2 top.qcow2 qemu-img resize --shrink top.qcow2 520k qemu-img resize top.qcow2 567k In the last step offset - zero_start causes an integer wraparound. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-Id: <20200504155217.10325-1-berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-05-08 13:26:35 +02:00
Eric Blake	ee1244a2e9	qcow2: Tweak comment about bitmaps vs. resize Our comment did not actually match the code. Rewrite the comment to be less sensitive to any future changes to qcow2-bitmap.c that might implement scenarios that we currently reject. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200428192648.749066-4-eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-05-05 13:17:36 +02:00
Eric Blake	7fa140abf6	qcow2: Allow resize of images with internal snapshots We originally refused to allow resize of images with internal snapshots because the v2 image format did not require the tracking of snapshot size, making it impossible to safely revert to a snapshot with a different size than the current view of the image. But the snapshot size tracking was rectified in v3, and our recent fixes to qemu-img amend (see `0a85af35`) guarantee that we always have a valid snapshot size. Thus, we no longer need to artificially limit image resizes, but it does become one more thing that would prevent a downgrade back to v2. And now that we support different-sized snapshots, it's also easy to fix reverting to a snapshot to apply the new size. Upgrade iotest 61 to cover this (we previously had NO coverage of refusal to resize while snapshots exist). Note that the amend process can fail but still have effects: in particular, since we break things into upgrade, resize, downgrade, a failure during resize does not roll back changes made during upgrade, nor does failure in downgrade roll back a resize. But this situation is pre-existing even without this patch; and without journaling, the best we could do is minimize the chance of partial failure by collecting all changes prior to doing any writes - which adds a lot of complexity but could still fail with EIO. On the other hand, we are careful that even if we have partial modification but then fail, the image is left viable (that is, we are careful to sequence things so that after each successful cluster write, there may be transient leaked clusters but no corrupt metadata). And complicating the code to make it more transaction-like is not worth the effort: a user can always request multiple 'qemu-img amend' changing one thing each, if they need finer-grained control over detecting the first failure than what they get by letting qemu decide how to sequence multiple changes. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200428192648.749066-3-eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-05-05 13:17:36 +02:00
Eric Blake	a3aeeab557	block: Add blk_new_with_bs() helper There are several callers that need to create a new block backend from an existing BDS; make the task slightly easier with a common helper routine. Suggested-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20200424190903.522087-2-eblake@redhat.com> [mreitz: Set @ret only in error paths, see https://lists.nongnu.org/archive/html/qemu-block/2020-04/msg01216.html] Signed-off-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200428192648.749066-2-eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2020-05-05 13:17:36 +02:00
Kevin Wolf	eb8a0cf3ba	qcow2: Forward ZERO_WRITE flag for full preallocation The BDRV_REQ_ZERO_WRITE is currently implemented in a way that first the image is possibly preallocated and then the zero flag is added to all clusters. This means that a copy-on-write operation may be needed when writing to these clusters, despite having used preallocation, negating one of the major benefits of preallocation. Instead, try to forward the BDRV_REQ_ZERO_WRITE to the protocol driver, and if the protocol driver can ensure that the new area reads as zeros, we can skip setting the zero flag in the qcow2 layer. Unfortunately, the same approach doesn't work for metadata preallocation, so we'll still set the zero flag there. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-Id: <20200424142701.67053-1-kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-04-30 17:51:07 +02:00

1 2 3 4 5 ...

681 Commits