qemu-e2k/block
Eric Blake cb2e28780c block: Perform copy-on-read in loop
Improve our braindead copy-on-read implementation.  Pre-patch,
we have multiple issues:
- we create a bounce buffer and perform a write for the entire
request, even if the active image already has 99% of the
clusters occupied, and really only needs to copy-on-read the
remaining 1% of the clusters
- our bounce buffer was as large as the read request, and can
needlessly exhaust our memory by using double the memory of
the request size (the original request plus our bounce buffer),
rather than a capped maximum overhead beyond the original
- if a driver has a max_transfer limit, we are bypassing the
normal code in bdrv_aligned_preadv() that fragments to that
limit, and instead attempt to read the entire buffer from the
driver in one go, which some drivers may assert on
- a client can request a large request of nearly 2G such that
rounding the request out to cluster boundaries results in a
byte count larger than 2G.  While this cannot exceed 32 bits,
it DOES have some follow-on problems:
-- the call to bdrv_driver_pread() can assert for exceeding
BDRV_REQUEST_MAX_BYTES, if the driver is old and lacks
.bdrv_co_preadv
-- if the buffer is all zeroes, the subsequent call to
bdrv_co_do_pwrite_zeroes is a no-op due to a negative size,
which means we did not actually copy on read

Fix all of these issues by breaking up the action into a loop,
where each iteration is capped to sane limits.  Also, querying
the allocation status allows us to optimize: when data is
already present in the active layer, we don't need to bounce.

Note that the code has a telling comment that copy-on-read
should probably be a filter driver rather than a bolt-on hack
in io.c; but that remains a task for another day.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-10-06 16:28:58 +02:00
..
Makefile.objs block: add throttle block filter driver 2017-09-06 10:12:02 +02:00
accounting.c block: make accounting thread-safe 2017-06-16 07:55:00 +08:00
backup.c dirty-bitmap: Change bdrv_dirty_iter_next() to report byte offset 2017-10-06 16:28:58 +02:00
blkdebug.c block: add default implementations for bdrv_co_get_block_status() 2017-09-04 18:31:13 +02:00
blkreplay.c block: change variable names in BlockDriverState 2017-06-26 14:54:46 +02:00
blkverify.c blkverify: Catch bs->exact_filename overflow 2017-06-26 14:54:46 +02:00
block-backend.c block: tidy ThrottleGroupMember initializations 2017-09-05 16:47:52 +02:00
bochs.c block: do not set BDS read_only if copy_on_read enabled 2017-04-24 15:09:33 -04:00
cloop.c block: do not set BDS read_only if copy_on_read enabled 2017-04-24 15:09:33 -04:00
commit.c commit: Remove overlay_bs 2017-10-06 16:28:58 +02:00
crypto.c block: Add PreallocMode to bdrv_truncate() 2017-07-11 17:45:01 +02:00
crypto.h qcow: convert QCow to use QCryptoBlock for encryption 2017-07-11 17:44:56 +02:00
curl.c curl: do not do aio_poll when waiting for a free CURLState 2017-05-16 10:34:50 -04:00
dirty-bitmap.c dirty-bitmap: Convert internal hbitmap size/granularity 2017-10-06 16:28:58 +02:00
dmg-bz2.c dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
dmg.c dmg: use DIV_ROUND_UP 2017-08-31 12:29:07 +02:00
dmg.h dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
file-posix.c file-posix: Clear out first sector in hdev_create 2017-09-26 14:46:23 +02:00
file-win32.c qapi: Change data type of the FOO_lookup generated for enum FOO 2017-09-04 13:09:13 +02:00
gluster.c qapi: Change data type of the FOO_lookup generated for enum FOO 2017-09-04 13:09:13 +02:00
io.c block: Perform copy-on-read in loop 2017-10-06 16:28:58 +02:00
iscsi-opts.c block/iscsi: statically link qemu_iscsi_opts 2017-01-27 18:07:58 +01:00
iscsi.c scsi: move block/scsi.h to include/scsi/constants.h 2017-09-19 14:09:31 +02:00
linux-aio.c block: explicitly acquire aiocontext in aio callbacks that need it 2017-02-21 11:39:39 +00:00
mirror.c mirror: Switch mirror_dirty_init() to byte-based iteration 2017-10-06 16:28:58 +02:00
nbd-client.c block/nbd-client: nbd_co_send_request: fix return code 2017-09-25 08:21:26 -05:00
nbd-client.h nbd-client: avoid spurious qio_channel_yield() re-entry 2017-08-23 11:22:15 -05:00
nbd.c nbd: Implement NBD_INFO_BLOCK_SIZE on client 2017-07-14 12:04:42 +02:00
nfs.c qapi: Mechanically convert FOO_lookup[...] to FOO_str(...) 2017-09-04 13:09:13 +02:00
null.c block/null: Remove 'filename' option 2017-08-08 15:19:16 +02:00
parallels.c qapi: drop the sentinel in enum array 2017-09-04 13:09:13 +02:00
qapi.c block: move ThrottleGroup membership to ThrottleGroupMember 2017-09-05 16:47:51 +02:00
qcow.c qcow: Check failure of bdrv_getlength() and bdrv_truncate() 2017-09-04 18:33:00 +02:00
qcow2-bitmap.c qcow2: Switch store_bitmap_data() to byte-based iteration 2017-10-06 16:28:58 +02:00
qcow2-cache.c qcow2: add qcow2_cache_discard 2017-09-26 15:00:32 +02:00
qcow2-cluster.c qcow2: add shrink image support 2017-09-26 15:00:32 +02:00
qcow2-refcount.c qcow2: add shrink image support 2017-09-26 15:00:32 +02:00
qcow2-snapshot.c qcow2: Discard/zero clusters by byte count 2017-05-11 14:28:07 +02:00
qcow2.c qcow2: Switch qcow2_measure() to byte-based iteration 2017-10-06 16:28:58 +02:00
qcow2.h qcow2: add shrink image support 2017-09-26 15:00:32 +02:00
qed-check.c qed: Use DIV_ROUND_UP 2016-06-07 18:19:24 +03:00
qed-cluster.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-l2-cache.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed-table.c qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
qed.c qapi: Mechanically convert FOO_lookup[...] to FOO_str(...) 2017-09-04 13:09:13 +02:00
qed.h qed: protect table cache with CoMutex 2017-07-17 11:34:11 +08:00
quorum.c qapi: Change data type of the FOO_lookup generated for enum FOO 2017-09-04 13:09:13 +02:00
raw-format.c block: remove unused bdrv_media_changed 2017-09-04 18:31:13 +02:00
rbd.c qapi: Mechanically convert FOO_lookup[...] to FOO_str(...) 2017-09-04 13:09:13 +02:00
replication.c block: Add reopen_queue to bdrv_child_perm() 2017-09-26 14:46:23 +02:00
sheepdog.c Merge QEMU I/O 2017/09/05 v2 2017-09-05 14:14:33 +01:00
snapshot.c qobject: Use simpler QDict/QList scalar insertion macros 2017-05-09 09:13:51 +02:00
ssh.c util: remove the obsolete non-blocking connect 2017-09-05 13:21:58 +01:00
stream.c block: Make bdrv_is_allocated_above() byte-based 2017-07-10 13:18:07 +02:00
throttle-groups.c block/throttle-groups.c: allocate RestartData on the heap 2017-09-26 14:46:23 +02:00
throttle.c block: add throttle block filter driver 2017-09-06 10:12:02 +02:00
trace-events block: move trace probes into bdrv_co_preadv|pwritev 2017-08-07 09:39:35 +01:00
vdi.c vdi: make it thread-safe 2017-07-17 11:28:15 +08:00
vhdx-endian.c vhdx: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx-log.c vhdx: use QEMU_ALIGN_DOWN 2017-08-31 12:29:07 +02:00
vhdx.c block/vhdx: check for offset overflow to bdrv_truncate() 2017-08-08 14:37:00 +02:00
vhdx.h block: vhdx - update PAYLOAD_BLOCK_UNMAPPED value to match 1.00 spec 2014-12-12 15:42:22 +00:00
vmdk.c vmdk: Fix error handling/reporting of vmdk_check 2017-08-08 15:19:16 +02:00
vpc.c vpc: use DIV_ROUND_UP 2017-08-31 12:29:07 +02:00
vvfat.c block: Add reopen_queue to bdrv_child_perm() 2017-09-26 14:46:23 +02:00
vxhs.c qobject: Use simpler QDict/QList scalar insertion macros 2017-05-09 09:13:51 +02:00
win32-aio.c block: explicitly acquire aiocontext in aio callbacks that need it 2017-02-21 11:39:39 +00:00
write-threshold.c block: use bdrv_add_before_write_notifier 2016-10-07 13:34:07 +02:00