qemu-e2k/block
Eric Blake b2f95feec5 block: Let write zeroes fallback work even with small max_transfer
Commit 443668ca rewrote the write_zeroes logic to guarantee that
an unaligned request never crosses a cluster boundary.  But
in the rewrite, the new code assumed that at most one iteration
would be needed to get to an alignment boundary.

However, it is easy to trigger an assertion failure: the Linux
kernel makes loopback devices advertise a max_transfer of only
64k.  Any operation that falls back to writes rather than more
efficient zeroing must obey max_transfer during that fallback.
When a format with clusters larger than 64k is layered atop the
protocol of file access to a loopback device, an unaligned head
may therefore need multiple iterations of the write fallback
before reaching the aligned boundary.

Test case:

$ qemu-img create -f qcow2 -o cluster_size=1M file 10M
$ losetup /dev/loop2 /path/to/file
$ qemu-io -f qcow2 /dev/loop2
qemu-io> w 7m 1k
qemu-io> w -z 8003584 2093056

In fairness to Denis (as the original listed author of the culprit
commit), the faulty logic for at most one iteration is probably all
my fault in reworking his idea.  But the solution is to restore what
was in place prior to that commit: when dealing with an unaligned
head or tail, iterate as many times as necessary while fragmenting
the operation at max_transfer boundaries.
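
The iteration described above can be illustrated with a minimal,
standalone sketch.  It is not the code in block/io.c: the helper name
count_head_fragments and its parameters are hypothetical, and it only
counts how many max_transfer-limited pieces are needed to get from an
unaligned offset to the next alignment boundary.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper (illustration only, not a QEMU function):
 * count the fragments needed to cover the unaligned head of a
 * request, when each fragment is capped at max_transfer bytes and
 * must not cross the next alignment (cluster) boundary.
 */
static int count_head_fragments(uint64_t offset, uint64_t count,
                                uint64_t alignment, uint64_t max_transfer)
{
    uint64_t head;
    int fragments = 0;

    assert(alignment > 0 && max_transfer > 0);
    head = offset % alignment;

    /* Loop as many times as necessary, not just once. */
    while (count > 0 && head != 0) {
        /* Stop at max_transfer or at the boundary, whichever is first. */
        uint64_t num = MIN(MIN(count, max_transfer), alignment - head);

        head = (head + num) % alignment;
        count -= num;
        fragments++;
    }
    return fragments;
}
```

For the test case above (offset 8003584, length 2093056, 1M clusters,
64k max_transfer), the offset lies 648k past the 7M boundary, so the
head is 376k long and the helper returns 6: five 64k fragments
followed by one 56k fragment.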

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
CC: qemu-stable@nongnu.org
CC: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-11-22 15:59:22 +01:00
accounting.c block: Clean up includes 2016-01-20 13:36:23 +01:00
archipelago.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
backup.c blockjob: refactor backup_start as backup_job_create 2016-11-14 22:47:34 -05:00
blkdebug.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
blkreplay.c replay: allow replay stopping and restarting 2016-09-27 11:57:30 +02:00
blkverify.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
block-backend.c block-backend: Always notify on blk_eject 2016-11-14 11:15:54 -05:00
bochs.c block: Convert bdrv_co_preadv/pwritev to BdrvChild 2016-07-05 16:46:27 +02:00
cloop.c block: Convert bdrv_pread(v) to BdrvChild 2016-07-05 16:46:27 +02:00
commit.c blockjob: add block_job_start 2016-11-14 22:47:34 -05:00
crypto.c crypto: make PBKDF iterations configurable for LUKS format 2016-09-19 16:30:45 +01:00
curl.c block/curl: Do not wait for data beyond EOF 2016-11-14 22:47:34 -05:00
dirty-bitmap.c block: More operations for meta dirty bitmap 2016-10-24 17:56:07 +02:00
dmg-bz2.c dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
dmg.c dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
dmg.h dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
gluster.c gluster: Fix use after free in glfs_clear_preopened() 2016-11-21 17:04:43 -05:00
io.c block: Let write zeroes fallback work even with small max_transfer 2016-11-22 15:59:22 +01:00
iscsi.c block/iscsi: Adding new iSER transport layer option 2016-10-24 11:30:55 +02:00
linux-aio.c linux-aio: fix re-entrant completion processing 2016-09-28 17:11:23 +01:00
Makefile.objs dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
mirror.c mirror: do not flush every time the disks are synced 2016-11-14 22:49:26 -05:00
nbd-client.c nbd: Implement NBD_CMD_WRITE_ZEROES on client 2016-11-02 09:28:56 +01:00
nbd-client.h nbd: Implement NBD_CMD_WRITE_ZEROES on client 2016-11-02 09:28:56 +01:00
nbd.c block/nbd: Fix the leaked visitor 2016-11-11 15:54:55 +01:00
nfs.c nfs: Fix memory leak in nfs_file_create() 2016-11-11 15:54:55 +01:00
null.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
parallels.c block/parallels: check new image size 2016-08-05 09:59:06 +01:00
qapi.c qapi: rename QmpOutputVisitor to QObjectOutputVisitor 2016-10-25 16:25:54 +02:00
qcow2-cache.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
qcow2-cluster.c qcow2: Support BDRV_REQ_MAY_UNMAP 2016-10-24 17:54:03 +02:00
qcow2-refcount.c block: Convert bdrv_discard() to byte-based 2016-07-20 14:11:55 +01:00
qcow2-snapshot.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
qcow2.c qcow2: Inform block layer about discard boundaries 2016-11-22 15:59:22 +01:00
qcow2.h qcow2: Remove stale FIXME comment 2016-11-11 15:54:55 +01:00
qcow.c crypto: extend mode as a parameter in qcrypto_cipher_supports() 2016-10-19 10:09:24 +01:00
qed-check.c qed: Use DIV_ROUND_UP 2016-06-07 18:19:24 +03:00
qed-cluster.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-gencb.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-l2-cache.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-table.c block: introduce BDRV_POLL_WHILE 2016-10-28 21:50:18 +08:00
qed.c qed: Implement .bdrv_drain 2016-10-28 21:50:18 +08:00
qed.h block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
quorum.c quorum: do not allocate multiple iovecs for FIFO strategy 2016-10-24 17:56:06 +02:00
raw_bsd.c raw_bsd: don't check size alignment when only offset is set 2016-11-11 15:54:55 +01:00
raw-posix.c raw-posix: Rename 'raw_s' to 'rs' 2016-11-11 15:56:22 +01:00
raw-win32.c block: improve error handling in raw_open 2016-10-24 17:54:03 +02:00
rbd.c rbd: make the code more readable 2016-11-01 07:55:57 -04:00
replication.c blockjob: refactor backup_start as backup_job_create 2016-11-14 22:47:34 -05:00
sheepdog.c block: only call aio_poll on the current thread's AioContext 2016-10-28 21:50:18 +08:00
snapshot.c error: Remove NULL checks on error_propagate() calls 2016-06-20 16:38:13 +02:00
ssh.c block/ssh: Code cleanup for unused parameter 2016-11-11 15:54:55 +01:00
stream.c blockjob: add block_job_start 2016-11-14 22:47:34 -05:00
throttle-groups.c throttle: Correct access to wrong BlockBackendPublic structures 2016-10-24 17:54:03 +02:00
trace-events blockjob: add block_job_start 2016-11-14 22:47:34 -05:00
vdi.c vdi: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx-endian.c vhdx: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx-log.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
vhdx.c vhdx: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx.h block: vhdx - update PAYLOAD_BLOCK_UNMAPPED value to match 1.00 spec 2014-12-12 15:42:22 +00:00
vmdk.c vmdk: add vmdk_co_pwritev_compressed 2016-09-05 19:06:48 +02:00
vpc.c vpc: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vvfat.c block: Add "read-only" to the options QDict 2016-09-23 13:36:10 +02:00
win32-aio.c linux-aio: share one LinuxAioState within an AioContext 2016-07-18 15:09:31 +01:00
write-threshold.c block: use bdrv_add_before_write_notifier 2016-10-07 13:34:07 +02:00