qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Peter Lieven	f1a7ff770f	block/nfs: fix nfs_client_open for filesize greater than 1TB DIV_ROUND_UP(st.st_size, BDRV_SECTOR_SIZE) was overflowing ret (int) if st.st_size is greater than 1TB. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Lieven <pl@kamp.de> Message-id: 1511798407-31129-1-git-send-email-pl@kamp.de Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-29 15:28:15 +01:00
Paolo Bonzini	5bf1d5a73a	blockjob: remove clock argument from block_job_sleep_ns All callers are using QEMU_CLOCK_REALTIME, and it will not be possible to support more than one clock when block_job_sleep_ns switches to a single timer stored in the BlockJob struct. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Tested-By: Jeff Cody <jcody@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-29 15:11:02 +01:00
Kevin Wolf	02d213009d	block: Expect graph changes in bdrv_parent_drained_begin/end The .drained_begin/end callbacks can (directly or indirectly via aio_poll()) cause block nodes to be removed or the current BdrvChild to point to a different child node. Use QLIST_FOREACH_SAFE() to make sure we don't access invalid BlockDriverStates or accidentally continue iterating the parents of the new child node instead of the node we actually came from. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Tested-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-29 14:22:03 +01:00
Kevin Wolf	70a5afedd6	block: Error out on load_vm with active dirty bitmaps Loading a snapshot invalidates the bitmap. Just marking all blocks dirty is not a useful response in practice, instead the user needs to be aware that we switch to a completely different state. If they are okay with losing the dirty bitmap, they can just explicitly delete it. This effectively reverts commit `04dec3c3ae`. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-11-21 14:48:23 +01:00
Kevin Wolf	2b624fe079	block: Add errp to bdrv_all_goto_snapshot() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-11-21 14:48:22 +01:00
Kevin Wolf	0b62bcbc61	block: Add errp to bdrv_snapshot_goto() Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com>	2017-11-21 14:48:22 +01:00
Kevin Wolf	1f4ad7d3b8	block: Don't request I/O permission with BDRV_O_NO_IO 'qemu-img info' makes sense even when BLK_PERM_CONSISTENT_READ cannot be granted because of a block job in a running qemu process. It already sets BDRV_O_NO_IO to indicate that it doesn't access the guest visible data at all. Check the BDRV_O_NO_IO flags in blk_new_open(), so that I/O related permissions are not unnecessarily requested and 'qemu-img info' can work even if BLK_PERM_CONSISTENT_READ cannot be granted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com>	2017-11-21 14:48:22 +01:00
Peter Maydell	2e02083438	Block layer patches for 2.11.0-rc2 -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJaDyNMAAoJEH8JsnLIjy/WP+oP/1G0r1b9PBtrEc+B9QG0sMh1 qivbMRmNUFhuzYW8VBzb2Cpsl/lpe/VlAumf75RUACOh4ZfxkOhaYTMB5x9FHvoa 5w35UC382zuXTMfeF4JufF5rVJdB+bB9THJOJ7cuaDQSHZNPo8uMmHrk4J5p5uuU IrOZLfqgQaWGdobGe6FPWy67z2PRWrzEI/Ogr8zwpa95GJhX3fY1CmGvt2MUBG+j cwdYcuX9pawFfjasCkfwYMzY7Uc9wvjIJYSe+WvrDrpjInISgxMOvXylzdWkpsGS rHnsLgRXnUE4BmOavzq+uvlo14hlrmdVn5UYlUkbJcx+Bkp7MZGuNjX6doimCdWg PpbQB3TIi7ce4v9cYWZSrzDDo6E/d1Evi3xMG0ZouU0ff1DPFIilm8lxb3xYia+b ItTLZ7yXWLijGhr4jTeGVhn0+zENiczyiQsL1sDdj83VnpAOGPNZTwtnNGug8ATJ VELwA7jlVEL47HYovQoUIFs0NA5xCbOQm5avS5gk44PB/dNkYGKdHhwgQUV/R36A lRNj7Vh7hwHSyHO/gKi3sqXTbNG/fTDSy1Nu1xc0pAROIKpc6AanP0mFn2DlZS35 kyZbxMbuGUsDQ2j0pcGuP60O7+2CY4jlXRCXo6vaHzP3xDfOTx2L9UYVfk/dwNJu uYbnbXWv06igjm3Y/2Vv =ClZj -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging Block layer patches for 2.11.0-rc2 # gpg: Signature made Fri 17 Nov 2017 17:58:36 GMT # gpg: using RSA key 0x7F09B272C88F2FD6 # gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" # Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6 * remotes/kevin/tags/for-upstream: (25 commits) iotests: Make 087 pass without AIO enabled block: Make bdrv_next() keep strong references qcow2: Fix overly broad madvise() qcow2: Refuse to get unaligned offsets from cache qcow2: Add bounds check to get_refblock_offset() block: Guard against NULL bs->drv qcow2: Unaligned zero cluster in handle_alloc() qcow2: check_errors are fatal qcow2: reject unaligned offsets in write compressed iotests: Add test for failing qemu-img commit tests: Add check-qobject for equality tests iotests: Add test for non-string option reopening block: qobject_is_equal() in bdrv_reopen_prepare() qapi: Add qobject_is_equal() qapi/qlist: Add qlist_append_null() macro qapi/qnull: Add own header qcow2: fix image corruption on commit with persistent bitmap iotests: test clearing unknown autoclear_features by qcow2 block: Fix permissions in image activation qcow2: fix image corruption after committing qcow2 image into base ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-11-17 19:08:07 +00:00
Max Reitz	5e003f17ec	block: Make bdrv_next() keep strong references On one hand, it is a good idea for bdrv_next() to return a strong reference because ideally nearly every pointer should be refcounted. This fixes intermittent failure of iotest 194. On the other, it is absolutely necessary for bdrv_next() itself to keep a strong reference to both the BB (in its first phase) and the BDS (at least in the second phase) because when called the next time, it will dereference those objects to get a link to the next one. Therefore, it needs these objects to stay around until then. Just storing the pointer to the next in the iterator is not really viable because that pointer might become invalid as well. Both arguments taken together means we should probably just invoke bdrv_ref() and blk_ref() in bdrv_next(). This means we have to assert that bdrv_next() is always called from the main loop, but that was probably necessary already before this patch and judging from the callers, it also looks to actually be the case. Keeping these strong references means however that callers need to give them up if they decide to abort the iteration early. They can do so through the new bdrv_next_cleanup() function. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110172545.32609-1-mreitz@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	08546bcfb2	qcow2: Fix overly broad madvise() @mem_size and @offset are both size_t, thus subtracting them from one another will just return a big size_t if mem_size < offset -- even more obvious here because the result is stored in another size_t. Checking that result to be positive is therefore not sufficient to exclude the case that offset > mem_size. Thus, we currently sometimes issue an madvise() over a very large address range. This is triggered by iotest 163, but with -m64, this does not result in tangible problems. But with -m32, this test produces three segfaults, all of which are fixed by this patch. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171114184127.24238-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Darren Kenny <darren.kenny@oracle.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	4efb1f7c61	qcow2: Refuse to get unaligned offsets from cache Instead of using an assertion, it is better to emit a corruption event here. Checking all offsets for correct alignment can be tedious and it is easily possible to forget to do so. qcow2_cache_do_get() is a function every L2 and refblock access has to go through, so this is a good central point to add such a check. And for good measure, let us also add an assertion that the offset is non-zero. Making this a corruption event is not feasible, because a zero offset usually means something special (such as the cluster is unused), so all callers should be checking this anyway. If they do not, it is their fault, hence the assertion here. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-6-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	23482f8a60	qcow2: Add bounds check to get_refblock_offset() Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728661 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-5-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	d470ad42ac	block: Guard against NULL bs->drv We currently do not guard everywhere against a NULL bs->drv where we should be doing so. Most of the places fixed here just do not care about that case at all. Some care implicitly, e.g. through a prior function call to bdrv_getlength() which would always fail for an ejected BDS. Add an assert there to make it more obvious. Other places seem to care, but do so insufficiently: Freeing clusters in a qcow2 image is an error-free operation, but it may leave the image in an unusable state anyway. Giving qcow2_free_clusters() an error code is not really viable, it is much easier to note that bs->drv may be NULL even after a successful driver call. This concerns bdrv_co_flush(), and the way the check is added to bdrv_co_pdiscard() (in every iteration instead of only once). Finally, some places employ at least an assert(bs->drv); somewhere, that may be reasonable (such as in the reopen code), but in bdrv_has_zero_init(), it is definitely not. Returning 0 there in case of an ejected BDS saves us much headache instead. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728660 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-4-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:31 +01:00
Max Reitz	93bbaf03ff	qcow2: Unaligned zero cluster in handle_alloc() We should check whether the cluster offset we are about to use is actually valid; that is, whether it is aligned to cluster boundaries. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728643 Buglink: https://bugs.launchpad.net/qemu/+bug/1728657 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:30 +01:00
Max Reitz	791fff504c	qcow2: check_errors are fatal When trying to repair a dirty image, qcow2_check() may apparently succeed (no really fatal error occurred that would prevent the check from continuing), but if check_errors in the result object is non-zero, we cannot trust the image to be usable. Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1728639 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171110203111.7666-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:30 +01:00
Anton Nefedov	3e3b838ffe	qcow2: reject unaligned offsets in write compressed Misaligned compressed write is not supported. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Message-id: 1510654613-47868-2-git-send-email-anton.nefedov@virtuozzo.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-17 18:21:30 +01:00
Eric Blake	4096974e18	qcow2: fix image corruption on commit with persistent bitmap If an image contains persistent bitmaps, we cannot use the fast path of bdrv_make_empty() to clear the image during qemu-img commit, because that will lose the clusters related to the bitmaps. Also leave a comment in qcow2_read_extensions to remind future feature additions to think about fast-path removal, since we just barely fixed the same bug for LUKS encryption. It's a pain that qemu-img has not yet been taught to manipulate, or even at a very minimum display, information about persistent bitmaps; instead, we have to use QMP commands. It's also a pain that only qeury-block and x-debug-block-dirty-bitmap-sha256 will allow bitmap introspection; but the former requires the node to be hooked to a block device, and the latter is experimental. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-17 18:21:01 +01:00
Eric Blake	08ace1d753	nbd: Don't crash when server reports NBD_CMD_READ failure If a server fails a read, for example with EIO, but the connection is still live, then we would crash trying to print a non-existent error message in nbd_client_co_preadv(). For consistency, also change the error printout in nbd_read_reply_entry(), although that instance does not crash. Bug introduced in commit `f140e300`. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171112013936.5942-1-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-17 08:02:45 -06:00
Daniel P. Berrange	f06033295b	qcow2: fix image corruption after committing qcow2 image into base After committing the qcow2 image contents into the base image, qemu-img will call bdrv_make_empty to drop the payload in the layered image. When this is done for qcow2 images, it blows away the LUKS encryption header, making the resulting image unusable. There are two codepaths for emptying a qcow2 image, and the second (slower) codepath leaves the LUKS header intact, so force use of that codepath. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-17 13:36:03 +01:00
Kevin Wolf	398e6ad014	block: Deprecate bdrv_set_read_only() and users bdrv_set_read_only() is used by some block drivers to override the read-only option given by the user. This is not how read-only images generally work in QEMU: Instead of second guessing what the user really meant (which currently includes making an image read-only even if the user didn't only use the default, but explicitly said read-only=off), we should error out if we can't provide what the user requested. This adds deprecation warnings to all callers of bdrv_set_read_only() so that the behaviour can be corrected after the usual deprecation period. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-17 13:35:59 +01:00
Daniel P. Berrange	f66afbe26f	qcow2: don't permit changing encryption parameters Currently if trying to change encryption parameters on a qcow2 image, qemu-img will abort. We already explicitly check for attempt to change encrypt.format but missed other parameters like encrypt.key-secret. Rather than list each parameter, just blacklist changing of all parameters with a 'encrypt.' prefix. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-17 13:35:59 +01:00
Wang Guang	611e0653ad	replication: Fix replication open fail replication_child_perm request write permissions for all child which will lead bdrv_check_perm fail. replication_child_perm() should request write permissions only if it is writable itself. Signed-off-by: Wang Guang <wang.guang55@zte.com.cn> Signed-off-by: Wang Yong <wang.yong155@zte.com.cn> Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-11-17 13:35:59 +01:00
Stefan Hajnoczi	341e0b5658	throttle-groups: forget timer and schedule next TGM on detach tg->any_timer_armed[] must be cleared when detaching pending timers from the AioContext. Failure to do so leads to hung I/O because it looks like there are still timers pending when in fact they have been removed. Other ThrottleGroupMembers might have requests pending too so it's necessary to schedule the next TGM so it can set a timer. This patch fixes hung I/O when QEMU is launched with drives that are in the same throttling group: (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct bs=512 & (guest)$ dd if=/dev/zero of=/dev/vdc oflag=direct bs=512 & (qemu) stop (qemu) cont ...I/O is stuck... Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20171116112150.27607-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-16 14:12:57 +00:00
Jeff Cody	1d0f37cf21	block/parallels: add migration blocker Migration does not work for parallels, and has been broken for a while (see patch 'block/parallels: Do not update header or truncate image when INMIGRATE'). The bdrv_invalidate_cache() method needs to be added for migration to be supported. Until this is done, prohibit migration. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 5e04a7c8a3089913fa58d484af42dab7993984ad.1510059970.git.jcody@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:26 +01:00
Jeff Cody	6c7d390b99	block/parallels: Do not update header or truncate image when INMIGRATE If we write or modify the image file while the QEMU run state is INMIGRATE, then the BDRV_O_INACTIVE BDS flag is set. This will cause an assert, since the image is marked inactive. Make sure we obey this flag. Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Jeff Cody <jcody@redhat.com> Message-id: 3996c930fa8cde8570b7a63032720d76a28fd78b.1510059970.git.jcody@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Jeff Cody	7479bf07c4	block/vhdx.c: Don't blindly update the header The VHDX specification requires that before user data modification of the vhdx image, the VHDX header file and data GUIDs need to be updated. In vhdx_open(), if the image is set to RDWR, we go ahead and update the header. However, just because the image is set to RDWR does not mean we can go ahead and write at this point - specifically, if the QEMU run state is INMIGRATE, the underlying file BS may be set to inactive via the BDS open flag of BDRV_O_INACTIVE. Attempting to write under this condition will cause an assert in bdrv_co_pwritev(). We can alternatively latch the first time the image is written. And lo and behold, we do just that, via vhdx_user_visible_write() in vhdx_co_writev(). This means the call to vhdx_update_headers() in vhdx_open() is likely just vestigial, and can be removed. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Jeff Cody <jcody@redhat.com> Message-id: 659e4cdba6ef4c651737852777c8c93d27b38040.1510059970.git.jcody@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Vladimir Sementsov-Ogievskiy	04dec3c3ae	block/snapshot: dirty all dirty bitmaps on snapshot-switch Snapshot-switch actually changes active state of disk so it should reflect on dirty bitmaps. Otherwise next incremental backup using these bitmaps will be invalid. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20171023092945.54532-1-vsementsov@virtuozzo.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	c9b83e9c23	qcow2: Assert that the crypto header does not overlap other metadata The crypto header is initialized only when QEMU is creating a new image, so there's no chance of this happening on a corrupted image. If QEMU is really trying to allocate the header overlapping other existing metadata sections then this is a serious bug in QEMU itself so let's add an assertion. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: ae3d77f312fc0c5e0ac2bbd71676c0112eebe2e5.1509718618.git.berto@igalia.com Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	951053a9ec	qcow2: Don't open images with header.refcount_table_clusters == 0 qcow2_do_open() is checking that header.refcount_table_clusters is not too large, but it doesn't check that it's greater than zero. Apart from the fact that an image like that is obviously corrupted, trying to use it crashes QEMU since we end up with a null s->refcount_table after qcow2_refcount_init(). These images can however be repaired, so allow opening them if the BDRV_O_CHECK flag is set. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: f9750f50c80359babba11062e88f5075a47e8e16.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	8aa34834d5	qcow2: Prevent allocating compressed clusters at offset 0 If the refcount data is corrupted then we can end up trying to allocate a new compressed cluster at offset 0 in the image, triggering an assertion in qcow2_alloc_bytes() that would crash QEMU: qcow2_alloc_bytes: Assertion `offset' failed. This patch adds an explicit check for this scenario and a new test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: fb53467cf48e95ff3330def1cf1003a5b862b7d9.1509718618.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	9883975050	qcow2: Prevent allocating L2 tables at offset 0 If the refcount data is corrupted then we can end up trying to allocate a new L2 table at offset 0 in the image, triggering an assertion in the qcow2 cache that would crash QEMU: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed This patch adds an explicit check for this scenario and a new test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 92dac37191ae7844a2da22c122204eb493cc3133.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	6bf45d59f9	qcow2: Prevent allocating refcount blocks at offset 0 Each entry in the qcow2 cache contains an offset field indicating the location of the data in the qcow2 image. If the offset is 0 then it means that the entry contains no data and is available to be used when needed. Because of that it is not possible to store in the cache the first cluster of the qcow2 image (offset = 0). This is not a problem because that cluster always contains the qcow2 header and we're not using this cache for that. However, if the qcow2 image is corrupted it can happen that we try to allocate a new refcount block at offset 0, triggering this assertion and crashing QEMU: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed This patch adds an explicit check for this scenario and a new test case. This problem was originally reported here: https://bugs.launchpad.net/qemu/+bug/1728615 Reported-by: R.Nageswara Sastry <nasastry@in.ibm.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 92a2fadd10d58b423f269c1d1a309af161cdc73f.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Peter Maydell	191b5fbfa6	Pull request The following disk I/O throttling fixes solve recent bugs. -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJaCsdYAAoJEJykq7OBq3PIZdMH/10xOuxOjvjxlJNkQquhrAmD y9dVj0jEtopdter/XR7ZCsww1UgxpIt8K43Dk1yWTKrm2bNN1v3cqemJV+UUTLFl LppKxt5Cm1JRKaCfN0hSwOp5pFJumzH6creVdQMQ3VNCSSw6xfV94pupaVE8at6D n4r3ZDF03ARETMJW7HY7QIFi1YVcfmi4wrx8rfhEGLZu06nHrtFQsDdH7SeErgXi wJh+ksji4EvX2xc54nhprCsc9HdzbfeBEYx6tdD0Uh3xm7xXd2oka5Rac74WuqYu B4aKwyFbvKZ0DYnENiOCkemTN51s+0GHLz43T92/DmQhJrBy8EU4TTCn73vgmto= =KnUT -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging Pull request The following disk I/O throttling fixes solve recent bugs. # gpg: Signature made Tue 14 Nov 2017 10:37:12 GMT # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: qemu-iotests: Test I/O limits with removable media block: Leave valid throttle timers when removing a BDS from a backend block: Check for inserted BlockDriverState in blk_io_limits_disable() throttle-groups: drain before detaching ThrottleState block: all I/O should be completed before removing throttle timers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-11-14 16:11:19 +00:00
Alberto Garcia	c89bcf3af0	block: Leave valid throttle timers when removing a BDS from a backend If a BlockBackend has I/O limits set then its ThrottleGroupMember structure uses the AioContext from its attached BlockDriverState. Those two contexts must be kept in sync manually. This is not ideal and will be fixed in the future by removing the throttling configuration from the BlockBackend and storing it in an implicit filter node instead, but for now we have to live with this. When you remove the BlockDriverState from the backend then the throttle timers are destroyed. If a new BlockDriverState is later inserted then they are created again using the new AioContext. There are a couple of problems with this: a) The code manipulates the timers directly, leaving the ThrottleGroupMember.aio_context field in an inconsisent state. b) If you remove the I/O limits (e.g by destroying the backend) when the timers are gone then throttle_group_unregister_tgm() will attempt to destroy them again, crashing QEMU. While b) could be fixed easily by allowing the timers to be freed twice, this would result in a situation in which we can no longer guarantee that a valid ThrottleState has a valid AioContext and timers. This patch ensures that the timers and AioContext are always valid when I/O limits are set, regardless of whether the BlockBackend has a BlockDriverState inserted or not. [Fixed "There'a" typo as suggested by Max Reitz <mreitz@redhat.com> --Stefan] Reported-by: sochin jiang <sochin.jiang@huawei.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: e089c66e7c20289b046d782cea4373b765c5bc1d.1510339534.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 15:43:49 +00:00
Alberto Garcia	48bf7ea81a	block: Check for inserted BlockDriverState in blk_io_limits_disable() When you set I/O limits using block_set_io_throttle or the command line throttling.* options they are kept in the BlockBackend regardless of whether a BlockDriverState is attached to the backend or not. Therefore when removing the limits using blk_io_limits_disable() we need to check if there's a BDS before attempting to drain it, else it will crash QEMU. This can be reproduced very easily using HMP: (qemu) drive_add 0 if=none,throttling.iops-total=5000 (qemu) drive_del none0 Reported-by: sochin jiang <sochin.jiang@huawei.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 0d3a67ce8d948bb33e08672564714dcfb76a3d8c.1510339534.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 14:38:46 +00:00
Stefan Hajnoczi	dc868fb03b	throttle-groups: drain before detaching ThrottleState I/O requests hang after stop/cont commands at least since QEMU 2.10.0 with -drive iops=100: (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000 (qemu) stop (qemu) cont ...I/O is stuck... This happens because blk_set_aio_context() detaches the ThrottleState while requests may still be in flight: if (tgm->throttle_state) { throttle_group_detach_aio_context(tgm); throttle_group_attach_aio_context(tgm, new_context); } This patch encloses the detach/attach calls in a drained region so no I/O request is left hanging. Also add assertions so we don't make the same mistake again in the future. Reported-by: Yongxue Hong <yhong@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20171110151934.16883-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 14:02:09 +00:00
Zhengui	632a773543	block: all I/O should be completed before removing throttle timers. In blk_remove_bs, all I/O should be completed before removing throttle timers. If there has inflight I/O, removing throttle timers here will cause the inflight I/O never return. This patch add bdrv_drained_begin before throttle_timers_detach_aio_context to let all I/O completed before removing throttle timers. [Moved declaration of bs as suggested by Alberto Garcia <berto@igalia.com>. --Stefan] Signed-off-by: Zhengui <lizhengui@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1508564040-120700-1-git-send-email-lizhengui@huawei.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 14:02:05 +00:00
Eric Blake	b4176cb314	nbd-client: Stricter enforcing of structured reply spec Ensure that the server is not sending unexpected chunk lengths for either the NONE or the OFFSET_DATA chunk, nor unexpected hole length for OFFSET_HOLE. This will flag any server as broken that responds to a zero-length read with an OFFSET_DATA (what our server currently does, but that's about to be fixed) or with OFFSET_HOLE, even though we previously fixed our client to never be able to send such a request over the wire. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:22:26 -06:00
Eric Blake	9d8f818cde	nbd-client: Short-circuit 0-length operations The NBD spec was recently clarified to state that clients should not send 0-length requests to the server, as the server behavior is undefined [1]. We know that qemu-nbd's behavior is a successful no-op (once it has filtered for read-only exports), but other NBD implementations might return an error. To avoid any questionable server implementations, it is better to just short-circuit such requests on the client side (we are relying on the block layer to already filter out requests such as invalid offset, write to a read-only volume, and so forth); do the short-circuit as late as possible to still benefit from protections from assertions that the block layer is not violating our assumptions. [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-6-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:18:31 -06:00
Eric Blake	1104d83c72	nbd-client: Refuse read-only client with BDRV_O_RDWR The NBD spec says that clients should not try to write/trim to an export advertised as read-only by the server. But we failed to check that, and would allow the block layer to use NBD with BDRV_O_RDWR even when the server is read-only, which meant we were depending on the server sending a proper EPERM failure for various commands, and also exposes a leaky abstraction: using qemu-io in read-write mode would succeed on 'w -z 0 0' because of local short-circuiting logic, but 'w 0 0' would send a request over the wire (where it then depends on the server, and fails at least for qemu-nbd but might pass for other NBD implementations). With this patch, a client MUST request read-only mode to access a server that is doing a read-only export, or else it will get a message like: can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export It is no longer possible to even attempt writes over the wire (including the corner case of 0-length writes), because the block layer enforces the explicit read-only request; this matches the behavior of qcow2 when backed by a read-only POSIX file. Fix several iotests to comply with the new behavior (since qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP, default to a read-only export, we must tell blockdev-add/qemu-io to set up a read-only client). CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:10:17 -06:00
Eric Blake	e659fb3b99	nbd-client: Fix error message typos Provide missing spaces that are required when using string concatenation to break error messages across source lines. Introduced in commit `f140e300`. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:09:38 -06:00
Vladimir Sementsov-Ogievskiy	f140e30003	nbd: Minimal structured read for client Minimal implementation: for structured error only error_report error message. Note that test 83 is now more verbose, because the implementation prints more warnings about unexpected communication errors; perhaps future patches should tone things down by using trace messages instead of traces, but the common case of successful communication is no noisier than before. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-13-eblake@redhat.com>	2017-10-30 21:48:41 +01:00
Vladimir Sementsov-Ogievskiy	d2febedb45	nbd/client: prepare nbd_receive_reply for structured reply In following patch nbd_receive_reply will be used both for simple and structured reply header receiving. NBDReply is altered into union of simple reply header and structured reply chunk header, simple error translation moved to block/nbd-client to be consistent with further structured reply error translation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-11-eblake@redhat.com>	2017-10-30 21:48:32 +01:00
Max Reitz	572b07bea1	qcow2: Always execute preallocate() in a coroutine Some qcow2 functions (at least perform_cow()) expect s->lock to be taken. Therefore, if we want to make use of them, we should execute preallocate() (as "preallocate_co") in a coroutine so that we can use the qemu_co_mutex_* functions. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-3-mreitz@redhat.com Cc: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:14 +02:00
Max Reitz	e400ad1e1f	qcow2: Fix unaligned preallocated truncation A qcow2 image file's length is not required to have a length that is a multiple of the cluster size. However, qcow2_refcount_area() expects an aligned value for its @start_offset parameter, so we need to round @old_file_size up to the next cluster boundary. Reported-by: Ping Li <pingl@redhat.com> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1414049 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-2-mreitz@redhat.com Cc: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:14 +02:00
Max Reitz	233521b199	qcow2: Emit errp when truncating the image tail bdrv_truncate() has an errp parameter which is always set when an error occurs. Let's use that instead of a plain strerror(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009155431.14093-1-mreitz@redhat.com Reviewed-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:14 +02:00
Alberto Garcia	a35f87f50d	qcow2: Use BDRV_SECTOR_BITS instead of its literal value BDRV_SECTOR_BITS is defined to be 9 in block.h (and BDRV_SECTOR_SIZE is calculated from that), but there are still a couple of places where we are using the literal value instead of the macro. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20171009153856.20387-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:13 +02:00
Eric Blake	8cbf74b23c	qcow2: Reduce is_zero() rounding Now that bdrv_is_allocated accepts non-aligned inputs, we can remove the TODO added in earlier refactoring. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	88e63df214	block: Reduce bdrv_aligned_preadv() rounding Now that bdrv_is_allocated accepts non-aligned inputs, we can remove the TODO added in commit `d6a644bb`. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	efa6e2ed64	block: Align block status requests Any device that has request_alignment greater than 512 should be unable to report status at a finer granularity; it may also be simpler for such devices to be guaranteed that the block layer has rounded things out to the granularity boundary (the way the block layer already rounds all other I/O out). Besides, getting the code correct for super-sector alignment also benefits us for the fact that our public interface now has byte granularity, even though none of our drivers have byte-level callbacks. Add an assertion in blkdebug that proves that the block layer never requests status of unaligned sections, similar to what it does on other requests (while still keeping the generic helper in place for when future patches add a throttle driver). Note that iotest 177 already covers this (it would fail if you use just the blkdebug.c hunk without the io.c changes). Meanwhile, we can drop assertions in callers that no longer have to pass in sector-aligned addresses. There is a mid-function scope added for 'count' and 'longret', for a couple of reasons: first, an upcoming patch will add an 'if' statement that checks whether a driver has an old- or new-style callback, and can conveniently use the same scope for less indentation churn at that time. Second, since we are trying to get rid of sector-based computations, wrapping things in a scope makes it easier to group and see what will be deleted in a final cleanup patch once all drivers have been converted to the new-style callback. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00

1 2 3 4 5 ...

3486 Commits