qemu-e2k/block
Changlong Xie 9bc9732fae nbd: Use CoQueue for free_sema instead of CoMutex
NBD is using the CoMutex in a way that wasn't anticipated. For example, if there are
N (N=26, MAX_NBD_REQUESTS=16) NBD write requests, nbd_client_co_pwritev is invoked
N times.
----------------------------------------------------------------------------------------
time request Actions
1    1       in_flight=1, Coroutine=C1
2    2       in_flight=2, Coroutine=C2
...
15   15      in_flight=15, Coroutine=C15
16   16      in_flight=16, Coroutine=C16, free_sema->holder=C16, mutex->locked=true
17   17      in_flight=16, Coroutine=C17, queue C17 into free_sema->queue
18   18      in_flight=16, Coroutine=C18, queue C18 into free_sema->queue
...
26   N       in_flight=16, Coroutine=C26, queue C26 into free_sema->queue
----------------------------------------------------------------------------------------

Once the NBD client receives the reply to request No.16, we re-enter C16. This is fine,
because C16 is equal to 'free_sema->holder'.
----------------------------------------------------------------------------------------
time request Actions
27   16      in_flight=15, Coroutine=C16, free_sema->holder=C16, mutex->locked=false
----------------------------------------------------------------------------------------

Then nbd_coroutine_end invokes qemu_co_mutex_unlock, which pops a coroutine from the head
of free_sema->queue and enters C17. Moreover, free_sema->holder is now C17.
----------------------------------------------------------------------------------------
time request Actions
28   17      in_flight=16, Coroutine=C17, free_sema->holder=C17, mutex->locked=true
----------------------------------------------------------------------------------------

In the above scenario, we have only received the reply to request No.16. As time goes by,
the NBD client will most likely receive replies to requests 1 to 15 rather than to request
17, which owns C17. In this case, we hit the assertion failure "mutex->holder == self",
which has been possible since Kevin's commit 0e438cdc "coroutine: Let CoMutex remember who
holds it". For example, if the NBD client receives the reply to request No.15, QEMU aborts
unexpectedly:
----------------------------------------------------------------------------------------
time request         Actions
29   15 (most cases) in_flight=15, Coroutine=C15, free_sema->holder=C17, mutex->locked=false
----------------------------------------------------------------------------------------

Per Paolo's suggestion "The simplest fix is to change it to CoQueue, which is like a condition
variable", this patch replaces CoMutex with CoQueue.
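
The scenario above can be replayed with a small standalone model. This is only a sketch
under stated assumptions, not the code of this patch: ToyGate, toy_gate_init,
toy_gate_start and toy_gate_end are hypothetical names, coroutines are reduced to integer
ids, and a woken waiter is admitted immediately instead of being re-entered later. The
'holder' field records which request a holder-tracking mutex would remember, so the model
can report when a different request would have performed the unlock.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_REQUESTS 16          /* mirrors MAX_NBD_REQUESTS=16 from the text above */
#define MAX_WAITERS  64          /* hypothetical bound, only for this toy model */

/* Toy stand-in for the request limiter; NOT QEMU's real data structure. */
typedef struct {
    int in_flight;               /* requests currently admitted */
    int holder;                  /* id a holder-tracking mutex would have recorded */
    int waiters[MAX_WAITERS];    /* FIFO of queued request ids */
    int head, tail;              /* queue is waiters[head..tail) */
} ToyGate;

static void toy_gate_init(ToyGate *g)
{
    g->in_flight = 0;
    g->holder = 0;               /* 0 means "nobody" */
    g->head = g->tail = 0;
}

/* Admit request `id`, or queue it when the limit is already reached.
 * Returns true if the request was admitted immediately. */
static bool toy_gate_start(ToyGate *g, int id)
{
    if (g->in_flight == MAX_REQUESTS) {
        assert(g->tail < MAX_WAITERS);
        g->waiters[g->tail++] = id;
        return false;
    }
    g->in_flight++;
    if (g->in_flight == MAX_REQUESTS) {
        g->holder = id;          /* the request taking the last slot "locks" */
    }
    return true;
}

/* Finish request `id`. Sets *mismatch when a holder-tracking mutex would have
 * been unlocked by a request other than its recorded holder. Returns the id
 * of the woken waiter, or -1 if nobody was woken. */
static int toy_gate_end(ToyGate *g, int id, bool *mismatch)
{
    bool was_full = (g->in_flight == MAX_REQUESTS);
    int woken = -1;

    *mismatch = was_full && g->holder != id;
    g->in_flight--;
    if (was_full && g->head < g->tail) {
        /* Condition-variable style wakeup: any finisher may wake the head. */
        woken = g->waiters[g->head++];
        g->in_flight++;
        g->holder = woken;
    }
    return woken;
}
```

Starting requests 1 to 26 and then finishing request 16 followed by request 15 reproduces
the tables: the first completion wakes request 17 without a mismatch, the second one wakes
request 18 and flags the mismatch that the CoMutex assertion trips over. A plain wait queue
has no owner to check, so the second wakeup is harmless there.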

Cc: Wen Congyang <wency@cn.fujitsu.com>
Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-Id: <1476267508-19499-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-01 16:06:57 +01:00
accounting.c block: Clean up includes 2016-01-20 13:36:23 +01:00
archipelago.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
backup.c block: Use block_job_add_bdrv() in backup_start() 2016-10-31 16:52:38 +01:00
blkdebug.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
blkreplay.c replay: allow replay stopping and restarting 2016-09-27 11:57:30 +02:00
blkverify.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
block-backend.c block: introduce BDRV_POLL_WHILE 2016-10-28 21:50:18 +08:00
bochs.c block: Convert bdrv_co_preadv/pwritev to BdrvChild 2016-07-05 16:46:27 +02:00
cloop.c block: Convert bdrv_pread(v) to BdrvChild 2016-07-05 16:46:27 +02:00
commit.c block: Block all nodes involved in the block-commit operation 2016-10-31 16:52:38 +01:00
crypto.c crypto: make PBKDF iterations configurable for LUKS format 2016-09-19 16:30:45 +01:00
curl.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
dirty-bitmap.c block: More operations for meta dirty bitmap 2016-10-24 17:56:07 +02:00
dmg-bz2.c dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
dmg.c dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
dmg.h dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
gluster.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
io.c block: Add bdrv_drain_all_{begin,end}() 2016-10-31 16:51:14 +01:00
iscsi.c block/iscsi: Adding new iSER transport layer option 2016-10-24 11:30:55 +02:00
linux-aio.c linux-aio: fix re-entrant completion processing 2016-09-28 17:11:23 +01:00
Makefile.objs dmg: Move libbz2 code to dmg-bz2.so 2016-10-07 14:14:06 +02:00
mirror.c block: Block all intermediate nodes in commit_active_start() 2016-10-31 16:52:38 +01:00
nbd-client.c nbd: Use CoQueue for free_sema instead of CoMutex 2016-11-01 16:06:57 +01:00
nbd-client.h nbd: Use CoQueue for free_sema instead of CoMutex 2016-11-01 16:06:57 +01:00
nbd.c Merge qio 2016/10/27 v1 2016-10-28 15:30:55 +01:00
nfs.c block/nfs: Introduce runtime_opts in NFS 2016-10-31 16:52:39 +01:00
null.c block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
parallels.c block/parallels: check new image size 2016-08-05 09:59:06 +01:00
qapi.c qapi: rename QmpOutputVisitor to QObjectOutputVisitor 2016-10-25 16:25:54 +02:00
qcow2-cache.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
qcow2-cluster.c qcow2: Support BDRV_REQ_MAY_UNMAP 2016-10-24 17:54:03 +02:00
qcow2-refcount.c block: Convert bdrv_discard() to byte-based 2016-07-20 14:11:55 +01:00
qcow2-snapshot.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
qcow2.c qcow2: Support BDRV_REQ_MAY_UNMAP 2016-10-24 17:54:03 +02:00
qcow2.h qcow2: Support BDRV_REQ_MAY_UNMAP 2016-10-24 17:54:03 +02:00
qcow.c crypto: extend mode as a parameter in qcrypto_cipher_supports() 2016-10-19 10:09:24 +01:00
qed-check.c qed: Use DIV_ROUND_UP 2016-06-07 18:19:24 +03:00
qed-cluster.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-gencb.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-l2-cache.c block: Clean up includes 2016-01-20 13:36:23 +01:00
qed-table.c block: introduce BDRV_POLL_WHILE 2016-10-28 21:50:18 +08:00
qed.c qed: Implement .bdrv_drain 2016-10-28 21:50:18 +08:00
qed.h block: use aio_bh_schedule_oneshot 2016-10-07 13:34:07 +02:00
quorum.c quorum: do not allocate multiple iovecs for FIFO strategy 2016-10-24 17:56:06 +02:00
raw_bsd.c raw_bsd: add offset and size options 2016-10-31 16:52:39 +01:00
raw-posix.c raw-posix: Don't use bdrv_ioctl() 2016-10-27 19:05:23 +02:00
raw-win32.c block: improve error handling in raw_open 2016-10-24 17:54:03 +02:00
rbd.c rbd: shift byte count as a 64-bit value 2016-10-23 16:10:59 +02:00
replication.c block: prepare bdrv_reopen_multiple to release AioContext 2016-10-28 21:50:18 +08:00
sheepdog.c block: only call aio_poll on the current thread's AioContext 2016-10-28 21:50:18 +08:00
snapshot.c error: Remove NULL checks on error_propagate() calls 2016-06-20 16:38:13 +02:00
ssh.c block/ssh: Use InetSocketAddress options 2016-10-31 16:49:13 +01:00
stream.c block: Support streaming to an intermediate layer 2016-10-31 16:52:38 +01:00
throttle-groups.c throttle: Correct access to wrong BlockBackendPublic structures 2016-10-24 17:54:03 +02:00
trace-events block: Remove bdrv_aio_pdiscard() 2016-10-27 19:05:22 +02:00
vdi.c vdi: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx-endian.c vhdx: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx-log.c block: Convert bdrv_pwrite(v/_sync) to BdrvChild 2016-07-05 16:46:27 +02:00
vhdx.c vhdx: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vhdx.h block: vhdx - update PAYLOAD_BLOCK_UNMAPPED value to match 1.00 spec 2014-12-12 15:42:22 +00:00
vmdk.c vmdk: add vmdk_co_pwritev_compressed 2016-09-05 19:06:48 +02:00
vpc.c vpc: Use QEMU UUID API 2016-09-23 11:42:52 +08:00
vvfat.c block: Add "read-only" to the options QDict 2016-09-23 13:36:10 +02:00
win32-aio.c linux-aio: share one LinuxAioState within an AioContext 2016-07-18 15:09:31 +01:00
write-threshold.c block: use bdrv_add_before_write_notifier 2016-10-07 13:34:07 +02:00