qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Jeff Cody	7479bf07c4	block/vhdx.c: Don't blindly update the header The VHDX specification requires that before user data modification of the vhdx image, the VHDX header file and data GUIDs need to be updated. In vhdx_open(), if the image is set to RDWR, we go ahead and update the header. However, just because the image is set to RDWR does not mean we can go ahead and write at this point - specifically, if the QEMU run state is INMIGRATE, the underlying file BS may be set to inactive via the BDS open flag of BDRV_O_INACTIVE. Attempting to write under this condition will cause an assert in bdrv_co_pwritev(). We can alternatively latch the first time the image is written. And lo and behold, we do just that, via vhdx_user_visible_write() in vhdx_co_writev(). This means the call to vhdx_update_headers() in vhdx_open() is likely just vestigial, and can be removed. Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru> Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Jeff Cody <jcody@redhat.com> Message-id: 659e4cdba6ef4c651737852777c8c93d27b38040.1510059970.git.jcody@redhat.com Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Denis V. Lunev <den@openvz.org> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Vladimir Sementsov-Ogievskiy	04dec3c3ae	block/snapshot: dirty all dirty bitmaps on snapshot-switch Snapshot-switch actually changes active state of disk so it should reflect on dirty bitmaps. Otherwise next incremental backup using these bitmaps will be invalid. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20171023092945.54532-1-vsementsov@virtuozzo.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	c9b83e9c23	qcow2: Assert that the crypto header does not overlap other metadata The crypto header is initialized only when QEMU is creating a new image, so there's no chance of this happening on a corrupted image. If QEMU is really trying to allocate the header overlapping other existing metadata sections then this is a serious bug in QEMU itself so let's add an assertion. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: ae3d77f312fc0c5e0ac2bbd71676c0112eebe2e5.1509718618.git.berto@igalia.com Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	951053a9ec	qcow2: Don't open images with header.refcount_table_clusters == 0 qcow2_do_open() is checking that header.refcount_table_clusters is not too large, but it doesn't check that it's greater than zero. Apart from the fact that an image like that is obviously corrupted, trying to use it crashes QEMU since we end up with a null s->refcount_table after qcow2_refcount_init(). These images can however be repaired, so allow opening them if the BDRV_O_CHECK flag is set. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: f9750f50c80359babba11062e88f5075a47e8e16.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	8aa34834d5	qcow2: Prevent allocating compressed clusters at offset 0 If the refcount data is corrupted then we can end up trying to allocate a new compressed cluster at offset 0 in the image, triggering an assertion in qcow2_alloc_bytes() that would crash QEMU: qcow2_alloc_bytes: Assertion `offset' failed. This patch adds an explicit check for this scenario and a new test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: fb53467cf48e95ff3330def1cf1003a5b862b7d9.1509718618.git.berto@igalia.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	9883975050	qcow2: Prevent allocating L2 tables at offset 0 If the refcount data is corrupted then we can end up trying to allocate a new L2 table at offset 0 in the image, triggering an assertion in the qcow2 cache that would crash QEMU: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed This patch adds an explicit check for this scenario and a new test case. Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 92dac37191ae7844a2da22c122204eb493cc3133.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Alberto Garcia	6bf45d59f9	qcow2: Prevent allocating refcount blocks at offset 0 Each entry in the qcow2 cache contains an offset field indicating the location of the data in the qcow2 image. If the offset is 0 then it means that the entry contains no data and is available to be used when needed. Because of that it is not possible to store in the cache the first cluster of the qcow2 image (offset = 0). This is not a problem because that cluster always contains the qcow2 header and we're not using this cache for that. However, if the qcow2 image is corrupted it can happen that we try to allocate a new refcount block at offset 0, triggering this assertion and crashing QEMU: qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed This patch adds an explicit check for this scenario and a new test case. This problem was originally reported here: https://bugs.launchpad.net/qemu/+bug/1728615 Reported-by: R.Nageswara Sastry <nasastry@in.ibm.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 92a2fadd10d58b423f269c1d1a309af161cdc73f.1509718618.git.berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-11-14 18:06:25 +01:00
Peter Maydell	191b5fbfa6	Pull request The following disk I/O throttling fixes solve recent bugs. -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJaCsdYAAoJEJykq7OBq3PIZdMH/10xOuxOjvjxlJNkQquhrAmD y9dVj0jEtopdter/XR7ZCsww1UgxpIt8K43Dk1yWTKrm2bNN1v3cqemJV+UUTLFl LppKxt5Cm1JRKaCfN0hSwOp5pFJumzH6creVdQMQ3VNCSSw6xfV94pupaVE8at6D n4r3ZDF03ARETMJW7HY7QIFi1YVcfmi4wrx8rfhEGLZu06nHrtFQsDdH7SeErgXi wJh+ksji4EvX2xc54nhprCsc9HdzbfeBEYx6tdD0Uh3xm7xXd2oka5Rac74WuqYu B4aKwyFbvKZ0DYnENiOCkemTN51s+0GHLz43T92/DmQhJrBy8EU4TTCn73vgmto= =KnUT -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging Pull request The following disk I/O throttling fixes solve recent bugs. # gpg: Signature made Tue 14 Nov 2017 10:37:12 GMT # gpg: using RSA key 0x9CA4ABB381AB73C8 # gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" # gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>" # Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8 * remotes/stefanha/tags/block-pull-request: qemu-iotests: Test I/O limits with removable media block: Leave valid throttle timers when removing a BDS from a backend block: Check for inserted BlockDriverState in blk_io_limits_disable() throttle-groups: drain before detaching ThrottleState block: all I/O should be completed before removing throttle timers. Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-11-14 16:11:19 +00:00
Alberto Garcia	c89bcf3af0	block: Leave valid throttle timers when removing a BDS from a backend If a BlockBackend has I/O limits set then its ThrottleGroupMember structure uses the AioContext from its attached BlockDriverState. Those two contexts must be kept in sync manually. This is not ideal and will be fixed in the future by removing the throttling configuration from the BlockBackend and storing it in an implicit filter node instead, but for now we have to live with this. When you remove the BlockDriverState from the backend then the throttle timers are destroyed. If a new BlockDriverState is later inserted then they are created again using the new AioContext. There are a couple of problems with this: a) The code manipulates the timers directly, leaving the ThrottleGroupMember.aio_context field in an inconsisent state. b) If you remove the I/O limits (e.g by destroying the backend) when the timers are gone then throttle_group_unregister_tgm() will attempt to destroy them again, crashing QEMU. While b) could be fixed easily by allowing the timers to be freed twice, this would result in a situation in which we can no longer guarantee that a valid ThrottleState has a valid AioContext and timers. This patch ensures that the timers and AioContext are always valid when I/O limits are set, regardless of whether the BlockBackend has a BlockDriverState inserted or not. [Fixed "There'a" typo as suggested by Max Reitz <mreitz@redhat.com> --Stefan] Reported-by: sochin jiang <sochin.jiang@huawei.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: e089c66e7c20289b046d782cea4373b765c5bc1d.1510339534.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 15:43:49 +00:00
Alberto Garcia	48bf7ea81a	block: Check for inserted BlockDriverState in blk_io_limits_disable() When you set I/O limits using block_set_io_throttle or the command line throttling.* options they are kept in the BlockBackend regardless of whether a BlockDriverState is attached to the backend or not. Therefore when removing the limits using blk_io_limits_disable() we need to check if there's a BDS before attempting to drain it, else it will crash QEMU. This can be reproduced very easily using HMP: (qemu) drive_add 0 if=none,throttling.iops-total=5000 (qemu) drive_del none0 Reported-by: sochin jiang <sochin.jiang@huawei.com> Signed-off-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 0d3a67ce8d948bb33e08672564714dcfb76a3d8c.1510339534.git.berto@igalia.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 14:38:46 +00:00
Stefan Hajnoczi	dc868fb03b	throttle-groups: drain before detaching ThrottleState I/O requests hang after stop/cont commands at least since QEMU 2.10.0 with -drive iops=100: (guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000 (qemu) stop (qemu) cont ...I/O is stuck... This happens because blk_set_aio_context() detaches the ThrottleState while requests may still be in flight: if (tgm->throttle_state) { throttle_group_detach_aio_context(tgm); throttle_group_attach_aio_context(tgm, new_context); } This patch encloses the detach/attach calls in a drained region so no I/O request is left hanging. Also add assertions so we don't make the same mistake again in the future. Reported-by: Yongxue Hong <yhong@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20171110151934.16883-1-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 14:02:09 +00:00
Zhengui	632a773543	block: all I/O should be completed before removing throttle timers. In blk_remove_bs, all I/O should be completed before removing throttle timers. If there has inflight I/O, removing throttle timers here will cause the inflight I/O never return. This patch add bdrv_drained_begin before throttle_timers_detach_aio_context to let all I/O completed before removing throttle timers. [Moved declaration of bs as suggested by Alberto Garcia <berto@igalia.com>. --Stefan] Signed-off-by: Zhengui <lizhengui@huawei.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 1508564040-120700-1-git-send-email-lizhengui@huawei.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-11-13 14:02:05 +00:00
Eric Blake	b4176cb314	nbd-client: Stricter enforcing of structured reply spec Ensure that the server is not sending unexpected chunk lengths for either the NONE or the OFFSET_DATA chunk, nor unexpected hole length for OFFSET_HOLE. This will flag any server as broken that responds to a zero-length read with an OFFSET_DATA (what our server currently does, but that's about to be fixed) or with OFFSET_HOLE, even though we previously fixed our client to never be able to send such a request over the wire. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-7-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:22:26 -06:00
Eric Blake	9d8f818cde	nbd-client: Short-circuit 0-length operations The NBD spec was recently clarified to state that clients should not send 0-length requests to the server, as the server behavior is undefined [1]. We know that qemu-nbd's behavior is a successful no-op (once it has filtered for read-only exports), but other NBD implementations might return an error. To avoid any questionable server implementations, it is better to just short-circuit such requests on the client side (we are relying on the block layer to already filter out requests such as invalid offset, write to a read-only volume, and so forth); do the short-circuit as late as possible to still benefit from protections from assertions that the block layer is not violating our assumptions. [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-6-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:18:31 -06:00
Eric Blake	1104d83c72	nbd-client: Refuse read-only client with BDRV_O_RDWR The NBD spec says that clients should not try to write/trim to an export advertised as read-only by the server. But we failed to check that, and would allow the block layer to use NBD with BDRV_O_RDWR even when the server is read-only, which meant we were depending on the server sending a proper EPERM failure for various commands, and also exposes a leaky abstraction: using qemu-io in read-write mode would succeed on 'w -z 0 0' because of local short-circuiting logic, but 'w 0 0' would send a request over the wire (where it then depends on the server, and fails at least for qemu-nbd but might pass for other NBD implementations). With this patch, a client MUST request read-only mode to access a server that is doing a read-only export, or else it will get a message like: can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export It is no longer possible to even attempt writes over the wire (including the corner case of 0-length writes), because the block layer enforces the explicit read-only request; this matches the behavior of qcow2 when backed by a read-only POSIX file. Fix several iotests to comply with the new behavior (since qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP, default to a read-only export, we must tell blockdev-add/qemu-io to set up a read-only client). CC: qemu-stable@nongnu.org Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-3-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:10:17 -06:00
Eric Blake	e659fb3b99	nbd-client: Fix error message typos Provide missing spaces that are required when using string concatenation to break error messages across source lines. Introduced in commit `f140e300`. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171108215703.9295-2-eblake@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>	2017-11-09 10:09:38 -06:00
Vladimir Sementsov-Ogievskiy	f140e30003	nbd: Minimal structured read for client Minimal implementation: for structured error only error_report error message. Note that test 83 is now more verbose, because the implementation prints more warnings about unexpected communication errors; perhaps future patches should tone things down by using trace messages instead of traces, but the common case of successful communication is no noisier than before. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-13-eblake@redhat.com>	2017-10-30 21:48:41 +01:00
Vladimir Sementsov-Ogievskiy	d2febedb45	nbd/client: prepare nbd_receive_reply for structured reply In following patch nbd_receive_reply will be used both for simple and structured reply header receiving. NBDReply is altered into union of simple reply header and structured reply chunk header, simple error translation moved to block/nbd-client to be consistent with further structured reply error translation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <20171027104037.8319-11-eblake@redhat.com>	2017-10-30 21:48:32 +01:00
Max Reitz	572b07bea1	qcow2: Always execute preallocate() in a coroutine Some qcow2 functions (at least perform_cow()) expect s->lock to be taken. Therefore, if we want to make use of them, we should execute preallocate() (as "preallocate_co") in a coroutine so that we can use the qemu_co_mutex_* functions. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-3-mreitz@redhat.com Cc: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:14 +02:00
Max Reitz	e400ad1e1f	qcow2: Fix unaligned preallocated truncation A qcow2 image file's length is not required to have a length that is a multiple of the cluster size. However, qcow2_refcount_area() expects an aligned value for its @start_offset parameter, so we need to round @old_file_size up to the next cluster boundary. Reported-by: Ping Li <pingl@redhat.com> Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1414049 Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009215533.12530-2-mreitz@redhat.com Cc: qemu-stable@nongnu.org Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:14 +02:00
Max Reitz	233521b199	qcow2: Emit errp when truncating the image tail bdrv_truncate() has an errp parameter which is always set when an error occurs. Let's use that instead of a plain strerror(). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20171009155431.14093-1-mreitz@redhat.com Reviewed-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:14 +02:00
Alberto Garcia	a35f87f50d	qcow2: Use BDRV_SECTOR_BITS instead of its literal value BDRV_SECTOR_BITS is defined to be 9 in block.h (and BDRV_SECTOR_SIZE is calculated from that), but there are still a couple of places where we are using the literal value instead of the macro. Signed-off-by: Alberto Garcia <berto@igalia.com> Message-id: 20171009153856.20387-1-berto@igalia.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-26 15:01:13 +02:00
Eric Blake	8cbf74b23c	qcow2: Reduce is_zero() rounding Now that bdrv_is_allocated accepts non-aligned inputs, we can remove the TODO added in earlier refactoring. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	88e63df214	block: Reduce bdrv_aligned_preadv() rounding Now that bdrv_is_allocated accepts non-aligned inputs, we can remove the TODO added in commit `d6a644bb`. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	efa6e2ed64	block: Align block status requests Any device that has request_alignment greater than 512 should be unable to report status at a finer granularity; it may also be simpler for such devices to be guaranteed that the block layer has rounded things out to the granularity boundary (the way the block layer already rounds all other I/O out). Besides, getting the code correct for super-sector alignment also benefits us for the fact that our public interface now has byte granularity, even though none of our drivers have byte-level callbacks. Add an assertion in blkdebug that proves that the block layer never requests status of unaligned sections, similar to what it does on other requests (while still keeping the generic helper in place for when future patches add a throttle driver). Note that iotest 177 already covers this (it would fail if you use just the blkdebug.c hunk without the io.c changes). Meanwhile, we can drop assertions in callers that no longer have to pass in sector-aligned addresses. There is a mid-function scope added for 'count' and 'longret', for a couple of reasons: first, an upcoming patch will add an 'if' statement that checks whether a driver has an old- or new-style callback, and can conveniently use the same scope for less indentation churn at that time. Second, since we are trying to get rid of sector-based computations, wrapping things in a scope makes it easier to group and see what will be deleted in a final cleanup patch once all drivers have been converted to the new-style callback. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	3182664220	block: Convert bdrv_get_block_status_above() to bytes We are gradually moving away from sector-based interfaces, towards byte-based. In the common case, allocation is unlikely to ever use values that are not naturally sector-aligned, but it is possible that byte-based values will let us be more precise about allocation at the end of an unaligned file that can do byte-based access. Changing the name of the function from bdrv_get_block_status_above() to bdrv_block_status_above() ensures that the compiler enforces that all callers are updated. Likewise, since it a byte interface allows an offset mapping that might not be sector aligned, split the mapping out of the return value and into a pass-by-reference parameter. For now, the io.c layer still assert()s that all uses are sector-aligned, but that can be relaxed when a later patch implements byte-based block status in the drivers. For the most part this patch is just the addition of scaling at the callers followed by inverse scaling at bdrv_block_status(), plus updates for the new split return interface. But some code, particularly bdrv_block_status(), gets a lot simpler because it no longer has to mess with sectors. Likewise, mirror code no longer computes s->granularity >> BDRV_SECTOR_BITS, and can therefore drop an assertion about alignment because the loop no longer depends on alignment (never mind that we don't really have a driver that reports sub-sector alignments, so it's not really possible to test the effect of sub-sector mirroring). Fix a neighboring assertion to use is_power_of_2 while there. For ease of review, bdrv_get_block_status() was tackled separately. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	5b648c67e3	block: Switch bdrv_co_get_block_status_above() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal type (no semantic change), and rename it to match the corresponding public function rename. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	7ddb99b9dc	block: Switch bdrv_common_block_status_above() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	4bcd936e47	block: Switch BdrvCoGetBlockStatusData to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal type (no semantic change), and rename it to match the corresponding public function rename. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	2e8bc7874b	block: Switch bdrv_co_get_block_status() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change); and as with its public counterpart, rename to bdrv_co_block_status() and split the offset return, to make the compiler enforce that we catch all uses. For now, we assert that callers and the return value still use aligned data, but ultimately, this will be the function where we hand off to a byte-based driver callback, and will eventually need to add logic to ensure we round calls according to the driver's request_alignment then touch up the result handed back to the caller, to start permitting a caller to pass unaligned offsets. Note that we are now prepared to accepts 'bytes' larger than INT_MAX; this is okay as long as we clamp things internally before violating any 32-bit limits, and makes no difference to how a client will use the information (clients looping over the entire file must already be prepared for consecutive calls to return the same status, as drivers are already free to return shorter-than-maximal status due to any other convenient split points, such as when the L2 table crosses cluster boundaries in qcow2). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	237d78f8fc	block: Convert bdrv_get_block_status() to bytes We are gradually moving away from sector-based interfaces, towards byte-based. In the common case, allocation is unlikely to ever use values that are not naturally sector-aligned, but it is possible that byte-based values will let us be more precise about allocation at the end of an unaligned file that can do byte-based access. Changing the name of the function from bdrv_get_block_status() to bdrv_block_status() ensures that the compiler enforces that all callers are updated. For now, the io.c layer still assert()s that all callers are sector-aligned, but that can be relaxed when a later patch implements byte-based block status in the drivers. There was an inherent limitation in returning the offset via the return value: we only have room for BDRV_BLOCK_OFFSET_MASK bits, which means an offset can only be mapped for sector-aligned queries (or, if we declare that non-aligned input is at the same relative position modulo 512 of the answer), so the new interface also changes things to return the offset via output through a parameter by reference rather than mashed into the return value. We'll have some glue code that munges between the two styles until we finish converting all uses. For the most part this patch is just the addition of scaling at the callers followed by inverse scaling at bdrv_block_status(), coupled with the tweak in calling convention. But some code, particularly bdrv_is_allocated(), gets a lot simpler because it no longer has to mess with sectors. For ease of review, bdrv_get_block_status_above() will be tackled separately. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	7286d6106f	block: Switch bdrv_make_zero() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Change the internal loop iteration of zeroing a device to track by bytes instead of sectors (although we are still guaranteed that we iterate by steps that are sector-aligned). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	f06f6b66c7	qcow2: Switch is_zero_sectors() to byte-based We are gradually converting to byte-based interfaces, as they are easier to reason about than sector-based. Convert another internal function (no semantic change), and rename it to is_zero() in the process. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	7cfd527525	block: Make bdrv_round_to_clusters() signature more useful In the process of converting sector-based interfaces to bytes, I'm finding it easier to represent a byte count as a 64-bit integer at the block layer (even if we are internally capped by SIZE_MAX or even INT_MAX for individual transactions, it's still nicer to not have to worry about truncation/overflow issues on as many variables). Update the signature of bdrv_round_to_clusters() to uniformly use int64_t, matching the signature already chosen for bdrv_is_allocated and the fact that off_t is also a signed type, then adjust clients according to the required fallout (even where the result could now exceed 32 bits, no client is directly assigning the result into a 32-bit value without breaking things into a loop first). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	c9ce8c4da6	block: Add flag to avoid wasted work in bdrv_is_allocated() Not all callers care about which BDS owns the mapping for a given range of the file, or where the zeroes lie within that mapping. In particular, bdrv_is_allocated() cares more about finding the largest run of allocated data from the guest perspective, whether or not that data is consecutive from the host perspective, and whether or not the data reads as zero. Therefore, doing subsequent refinements such as checking how much of the format-layer allocation also satisfies BDRV_BLOCK_ZERO at the protocol layer is wasted work - in the best case, it just costs extra CPU cycles during a single bdrv_is_allocated(), but in the worst case, it results in a smaller *pnum, and forces callers to iterate through more status probes when visiting the entire file for even more extra CPU cycles. This patch only optimizes the block layer (no behavior change when want_zero is true, but skip unnecessary effort when it is false). Then when subsequent patches tweak the driver callback to be byte-based, we can also pass this hint through to the driver. Tweak BdrvCoGetBlockStatusData to declare arguments in parameter order, rather than mixing things up (minimizing padding is not necessary here). Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Eric Blake	298a1665a2	block: Allow NULL file for bdrv_get_block_status() Not all callers care about which BDS owns the mapping for a given range of the file. This patch merely simplifies the callers by consolidating the logic in the common call point, while guaranteeing a non-NULL file to all the driver callbacks, for no semantic change. The only caller that does not care about pnum is bdrv_is_allocated, as invoked by vvfat; we can likewise add assertions that the rest of the stack does not have to worry about a NULL pnum. Furthermore, this will also set the stage for a future cleanup: when a caller does not care about which BDS owns an offset, it would be nice to allow the driver to optimize things to not have to return BDRV_BLOCK_OFFSET_VALID in the first place. In the case of fragmented allocation (for example, it's fairly easy to create a qcow2 image where consecutive guest addresses are not at consecutive host addresses), the current contract requires bdrv_get_block_status() to clamp pnum to the limit where host addresses are no longer consecutive, but allowing a NULL file means that pnum could be set to the full length of known-allocated data. Signed-off-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2017-10-26 14:45:57 +02:00
Peter Maydell	79b2a13aa8	nbd patches for 2017-10-14 - Marc-André Lureau - NBD: use g_new() family of functions - Vladimir Sementsov-Ogievskiy - first half of 00/13 nbd minimal structured read -----BEGIN PGP SIGNATURE----- Comment: Public key at http://people.redhat.com/eblake/eblake.gpg iQEcBAABCAAGBQJZ4q4XAAoJEKeha0olJ0NqEooH/R8NKYACELA39xrLdEMUQuZY 1Lm3/OtpBIICKx7OiZ7LniqApAI++FgjNxOf6PAfNG0TmEA+wMFaZ6NJEdi9DAmv kJVLsxiqKLDD+WIKMq5XfZQoFMJ8rV8W2/BYx9cF3Pl4KMT20qDsumsncZJ7DGOR jjsbAI8Q6g45VBx6TJbxXiTMDj87nIyNaydAGzRQTmEHtnmh8mllPiuEhJu24l6G 7CQKfcu4/7Te/5PvJIPn7CxHdVjLYalgWDRkU3kXcwmO8vGQEkYoiHPoc8lGsGtw oXJ2YIODYBIjeICkF0/PjT9aoeJQG8EuHR1hT0CW5dVBZz/DlVP/j+EZ6IDV/8k= =ud0Z -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-10-14' into staging nbd patches for 2017-10-14 - Marc-André Lureau - NBD: use g_new() family of functions - Vladimir Sementsov-Ogievskiy - first half of 00/13 nbd minimal structured read # gpg: Signature made Sun 15 Oct 2017 01:38:47 BST # gpg: using RSA key 0xA7A16B4A2527436A # gpg: Good signature from "Eric Blake <eblake@redhat.com>" # gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" # gpg: aka "[jpeg image of size 6874]" # Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A * remotes/ericb/tags/pull-nbd-2017-10-14: nbd: header constants indenting nbd/server: simplify reply transmission nbd/server: refactor nbd_co_send_simple_reply parameters nbd/server: do not use NBDReply structure nbd/server: structurize simple reply header sending nbd: rename some simple-request related objects to be _simple_ block/nbd-client: refactor nbd_co_receive_reply block/nbd-client: assert qiov len once in nbd_co_request NBD: use g_new() family of functions Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2017-10-16 15:54:42 +01:00
Manos Pitsidianakis	b867eaa17b	block/throttle.c: add bdrv_co_drain_begin/end callbacks Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-10-13 12:38:41 +01:00
Manos Pitsidianakis	f8ea8dacf0	block: rename bdrv_co_drain to bdrv_co_drain_begin Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-10-13 12:38:41 +01:00
Manos Pitsidianakis	481cad48e5	block: add bdrv_co_drain_end callback BlockDriverState has a bdrv_co_drain() callback but no equivalent for the end of the drain. The throttle driver (block/throttle.c) needs a way to mark the end of the drain in order to toggle io_limits_disabled correctly, thus bdrv_co_drain_end is needed. Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-10-13 12:38:41 +01:00
Vladimir Sementsov-Ogievskiy	ed397b2fe7	block/nbd-client: refactor nbd_co_receive_reply Pass handle parameter directly, as the whole request isn't needed. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20171012095319.136610-3-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-10-12 16:20:27 -05:00
Vladimir Sementsov-Ogievskiy	4bfe4478d1	block/nbd-client: assert qiov len once in nbd_co_request Also improve the assertion: check that qiov is NULL for other commands than CMD_READ and CMD_WRITE. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-Id: <20171012095319.136610-2-vsementsov@virtuozzo.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2017-10-12 16:19:35 -05:00
Eduardo Habkost	e5766d6ec7	config: qemu_config_parse() return number of config groups Change qemu_config_parse() to return the number of config groups in success and -EINVAL on error. This will allow callers of qemu_config_parse() to check if something was really loaded from the config file. All existing callers of qemu_config_parse() and qemu_read_config_file() only check if the return value was negative, so the change shouldn't affect them. Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20171004025043.3788-2-ehabkost@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2017-10-09 23:21:52 -03:00
Vladimir Sementsov-Ogievskiy	ce960aa906	block/mirror: check backing in bdrv_mirror_top_flush Backing may be zero after failed bdrv_append in mirror_start_job, which leads to SIGSEGV. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170929152255.5431-1-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:48 +02:00
Pavel Butsykin	163bc39d2c	qcow2: truncate the tail of the image file after shrinking the image Now after shrinking the image, at the end of the image file, there might be a tail that probably will never be used. So we can find the last used cluster and cut the tail. Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: John Snow <jsnow@redhat.com> Message-id: 20170929121613.25997-3-pbutsykin@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:48 +02:00
Pavel Butsykin	76a2a30a99	qcow2: fix return error code in qcow2_truncate() Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20170929121613.25997-2-pbutsykin@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:48 +02:00
Vladimir Sementsov-Ogievskiy	18775ff326	block/mirror: check backing in bdrv_mirror_top_refresh_filename Backing may be zero after failed bdrv_attach_child in bdrv_set_backing_hd, which leads to SIGSEGV. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20170928120300.58164-1-vsementsov@virtuozzo.com Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:47 +02:00
Daniel P. Berrange	d67a6b09b4	block: support passthrough of BDRV_REQ_FUA in crypto driver The BDRV_REQ_FUA flag can trivially be allowed in the crypt driver as a passthrough to the underlying block driver. Reviewed-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170927125340.12360-7-berrange@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:47 +02:00
Daniel P. Berrange	4609742a49	block: convert qcrypto_block_encrypt\|decrypt to take bytes offset Instead of sector offset, take the bytes offset when encrypting or decrypting data. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170927125340.12360-6-berrange@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:47 +02:00
Daniel P. Berrange	a73466fbad	block: convert crypto driver to bdrv_co_preadv\|pwritev Make the crypto driver implement the bdrv_co_preadv\|pwritev callbacks, and also use bdrv_co_preadv\|pwritev for I/O with the protocol driver beneath. This replaces sector based I/O with byte based I/O, and allows us to stop assuming the physical sector size matches the encryption sector size. Signed-off-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170927125340.12360-5-berrange@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2017-10-06 16:30:47 +02:00

1 2 3 4 5 ...

3461 Commits