qemu-e2k

Author	SHA1	Message	Date
Sergio Lopez	c7040ff64e	block: Honor blk_set_aio_context() context requirements The documentation for bdrv_set_aio_context_ignore() states this: * The caller must own the AioContext lock for the old AioContext of bs, but it * must not own the AioContext lock for new_context (unless new_context is the * same as the current context of bs). As blk_set_aio_context() makes use of this function, this rule also applies to it. Fix all occurrences where this rule wasn't honored. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20201214170519.223781-2-slp@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-01-20 14:48:08 -06:00
Sergio Lopez	49b44549ac	virtio-blk: On restart, process queued requests in the proper context On restart, we were scheduling a BH to process queued requests, which would run before starting up the data plane, leading to those requests being assigned and started on coroutines on the main context. This could cause requests to be wrongly processed in parallel from different threads (the main thread and the iothread managing the data plane), potentially leading to multiple issues. For example, stopping and resuming a VM multiple times while the guest is generating I/O on a virtio_blk device can trigger a crash with a stack tracing looking like this one: <------> Thread 2 (Thread 0x7ff736765700 (LWP 1062503)): #0 0x00005567a13b99d6 in iov_memset (iov=0x6563617073206f4e, iov_cnt=1717922848, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:69 #1 0x00005567a13bab73 in qemu_iovec_memset (qiov=0x7ff73ec99748, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:530 #2 0x00005567a12f411c in qemu_laio_process_completion (laiocb=0x7ff6512ee6c0) at block/linux-aio.c:86 #3 0x00005567a12f42ff in qemu_laio_process_completions (s=0x7ff7182e8420) at block/linux-aio.c:217 #4 0x00005567a12f480d in ioq_submit (s=0x7ff7182e8420) at block/linux-aio.c:323 #5 0x00005567a12f43d9 in qemu_laio_process_completions_and_submit (s=0x7ff7182e8420) at block/linux-aio.c:236 #6 0x00005567a12f44c2 in qemu_laio_poll_cb (opaque=0x7ff7182e8430) at block/linux-aio.c:267 #7 0x00005567a13aed83 in run_poll_handlers_once (ctx=0x5567a2b58c70, timeout=0x7ff7367645f8) at util/aio-posix.c:520 #8 0x00005567a13aee9f in run_poll_handlers (ctx=0x5567a2b58c70, max_ns=16000, timeout=0x7ff7367645f8) at util/aio-posix.c:562 #9 0x00005567a13aefde in try_poll_mode (ctx=0x5567a2b58c70, timeout=0x7ff7367645f8) at util/aio-posix.c:597 #10 0x00005567a13af115 in aio_poll (ctx=0x5567a2b58c70, blocking=true) at util/aio-posix.c:639 #11 0x00005567a109acca in iothread_run (opaque=0x5567a2b29760) at iothread.c:75 #12 0x00005567a13b2790 in qemu_thread_start (args=0x5567a2b694c0) at util/qemu-thread-posix.c:519 #13 0x00007ff73eedf2de in start_thread () at /lib64/libpthread.so.0 #14 0x00007ff73ec10e83 in clone () at /lib64/libc.so.6 Thread 1 (Thread 0x7ff743986f00 (LWP 1062500)): #0 0x00005567a13b99d6 in iov_memset (iov=0x6563617073206f4e, iov_cnt=1717922848, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:69 #1 0x00005567a13bab73 in qemu_iovec_memset (qiov=0x7ff73ec99748, offset=516096, fillc=0, bytes=7018105756081554803) at util/iov.c:530 #2 0x00005567a12f411c in qemu_laio_process_completion (laiocb=0x7ff6512ee6c0) at block/linux-aio.c:86 #3 0x00005567a12f42ff in qemu_laio_process_completions (s=0x7ff7182e8420) at block/linux-aio.c:217 #4 0x00005567a12f480d in ioq_submit (s=0x7ff7182e8420) at block/linux-aio.c:323 #5 0x00005567a12f4a2f in laio_do_submit (fd=19, laiocb=0x7ff5f4ff9ae0, offset=472363008, type=2) at block/linux-aio.c:375 #6 0x00005567a12f4af2 in laio_co_submit (bs=0x5567a2b8c460, s=0x7ff7182e8420, fd=19, offset=472363008, qiov=0x7ff5f4ff9ca0, type=2) at block/linux-aio.c:394 #7 0x00005567a12f1803 in raw_co_prw (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, type=2) at block/file-posix.c:1892 #8 0x00005567a12f1941 in raw_co_pwritev (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, flags=0) at block/file-posix.c:1925 #9 0x00005567a12fe3e1 in bdrv_driver_pwritev (bs=0x5567a2b8c460, offset=472363008, bytes=20480, qiov=0x7ff5f4ff9ca0, qiov_offset=0, flags=0) at block/io.c:1183 #10 0x00005567a1300340 in bdrv_aligned_pwritev (child=0x5567a2b5b070, req=0x7ff5f4ff9db0, offset=472363008, bytes=20480, align=512, qiov=0x7ff72c0425b8, qiov_offset=0, flags=0) at block/io.c:1980 #11 0x00005567a1300b29 in bdrv_co_pwritev_part (child=0x5567a2b5b070, offset=472363008, bytes=20480, qiov=0x7ff72c0425b8, qiov_offset=0, flags=0) at block/io.c:2137 #12 0x00005567a12baba1 in qcow2_co_pwritev_task (bs=0x5567a2b92740, file_cluster_offset=472317952, offset=487305216, bytes=20480, qiov=0x7ff72c0425b8, qiov_offset=0, l2meta=0x0) at block/qcow2.c:2444 #13 0x00005567a12bacdb in qcow2_co_pwritev_task_entry (task=0x5567a2b48540) at block/qcow2.c:2475 #14 0x00005567a13167d8 in aio_task_co (opaque=0x5567a2b48540) at block/aio_task.c:45 #15 0x00005567a13cf00c in coroutine_trampoline (i0=738245600, i1=32759) at util/coroutine-ucontext.c:115 #16 0x00007ff73eb622e0 in __start_context () at /lib64/libc.so.6 #17 0x00007ff6626f1350 in () #18 0x0000000000000000 in () <------> This is also known to cause crashes with this message (assertion failed): aio_co_schedule: Co-routine was already scheduled in 'aio_co_schedule' RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1812765 Signed-off-by: Sergio Lopez <slp@redhat.com> Message-Id: <20200603093240.40489-3-slp@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2020-06-17 14:53:39 +02:00
Li Hangjing	725fe5d10d	virtio-blk: fix out-of-bounds access to bitmap in notify_guest_bh When the number of a virtio-blk device's virtqueues is larger than BITS_PER_LONG, the out-of-bounds access to bitmap[ ] will occur. Fixes: `e21737ab15` ("virtio-blk: multiqueue batch notify") Cc: qemu-stable@nongnu.org Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Li Hangjing <lihangjing@baidu.com> Reviewed-by: Xie Yongji <xieyongji@baidu.com> Reviewed-by: Chai Wen <chaiwen@baidu.com> Message-id: 20191216023050.48620-1-lihangjing@baidu.com Message-Id: <20191216023050.48620-1-lihangjing@baidu.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-12-19 16:20:25 +00:00
Philippe Mathieu-Daudé	ebb6ff25cd	virtio-blk: Cancel the pending BH when the dataplane is reset When 'system_reset' is called, the main loop clear the memory region cache before the BH has a chance to execute. Later when the deferred function is called, some assumptions that were made when scheduling them are no longer true when they actually execute. This is what happens using a virtio-blk device (fresh RHEL7.8 install): $ (sleep 12.3; echo system_reset; sleep 12.3; echo system_reset; sleep 1; echo q) \ \| qemu-system-x86_64 -m 4G -smp 8 -boot menu=on \ -device virtio-blk-pci,id=image1,drive=drive_image1 \ -drive file=/var/lib/libvirt/images/rhel78.qcow2,if=none,id=drive_image1,format=qcow2,cache=none \ -device virtio-net-pci,netdev=net0,id=nic0,mac=52:54:00:c4:e7:84 \ -netdev tap,id=net0,script=/bin/true,downscript=/bin/true,vhost=on \ -monitor stdio -serial null -nographic (qemu) system_reset (qemu) system_reset (qemu) qemu-system-x86_64: hw/virtio/virtio.c:225: vring_get_region_caches: Assertion `caches != NULL' failed. Aborted (gdb) bt Thread 1 (Thread 0x7f109c17b680 (LWP 10939)): #0 0x00005604083296d1 in vring_get_region_caches (vq=0x56040a24bdd0) at hw/virtio/virtio.c:227 #1 0x000056040832972b in vring_avail_flags (vq=0x56040a24bdd0) at hw/virtio/virtio.c:235 #2 0x000056040832d13d in virtio_should_notify (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1648 #3 0x000056040832d1f8 in virtio_notify_irqfd (vdev=0x56040a240630, vq=0x56040a24bdd0) at hw/virtio/virtio.c:1662 #4 0x00005604082d213d in notify_guest_bh (opaque=0x56040a243ec0) at hw/block/dataplane/virtio-blk.c:75 #5 0x000056040883dc35 in aio_bh_call (bh=0x56040a243f10) at util/async.c:90 #6 0x000056040883dccd in aio_bh_poll (ctx=0x560409161980) at util/async.c:118 #7 0x0000560408842af7 in aio_dispatch (ctx=0x560409161980) at util/aio-posix.c:460 #8 0x000056040883e068 in aio_ctx_dispatch (source=0x560409161980, callback=0x0, user_data=0x0) at util/async.c:261 #9 0x00007f10a8fca06d in g_main_context_dispatch () at /lib64/libglib-2.0.so.0 #10 0x0000560408841445 in glib_pollfds_poll () at util/main-loop.c:215 #11 0x00005604088414bf in os_host_main_loop_wait (timeout=0) at util/main-loop.c:238 #12 0x00005604088415c4 in main_loop_wait (nonblocking=0) at util/main-loop.c:514 #13 0x0000560408416b1e in main_loop () at vl.c:1923 #14 0x000056040841e0e8 in main (argc=20, argv=0x7ffc2c3f9c58, envp=0x7ffc2c3f9d00) at vl.c:4578 Fix this by cancelling the BH when the virtio dataplane is stopped. [This is version of the patch was modified as discussed with Philippe on the mailing list thread. --Stefan] Reported-by: Yihuang Yu <yihyu@redhat.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Fixes: https://bugs.launchpad.net/qemu/+bug/1839428 Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190816171503.24761-1-philmd@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2019-09-03 16:11:18 +01:00
Markus Armbruster	db72581598	Include qemu/main-loop.h less In my "build everything" tree, changing qemu/main-loop.h triggers a recompile of some 5600 out of 6600 objects (not counting tests and objects that don't depend on qemu/osdep.h). It includes block/aio.h, which in turn includes qemu/event_notifier.h, qemu/notify.h, qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h, qemu/thread.h, qemu/timer.h, and a few more. Include qemu/main-loop.h only where it's needed. Touching it now recompiles only some 1700 objects. For block/aio.h and qemu/event_notifier.h, these numbers drop from 5600 to 2800. For the others, they shrink only slightly. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190812052359.30071-21-armbru@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>	2019-08-16 13:31:52 +02:00
Kevin Wolf	97896a4887	block: Add Error to blk_set_aio_context() Add an Error parameter to blk_set_aio_context() and use bdrv_child_try_set_aio_context() internally to check whether all involved nodes can actually support the AioContext switch. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-06-04 15:22:22 +02:00
Thomas Huth	a1d30f285e	Replace '-enable-kvm' with '-accel kvm' in docs and help texts The preferred way to select the KVM accelerator is to use "-accel kvm" these days, so let's be consistent in our documentation and help texts. Signed-off-by: Thomas Huth <thuth@redhat.com> Message-Id: <1528866321-23886-3-git-send-email-thuth@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-28 19:05:32 +02:00
Philippe Mathieu-Daudé	0304f9ec9c	hw: Do not include "sysemu/block-backend.h" if it is not necessary Remove those unneeded includes to speed up the compilation process a little bit. (Continue `7eceff5b5a` cleanup) Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180528232719.4721-13-f4bug@amsat.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cohuck@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-06-01 14:15:10 +02:00
Stefan Hajnoczi	1010cadf62	virtio-blk: fix race between .ioeventfd_stop() and vq handler If the main loop thread invokes .ioeventfd_stop() just as the vq handler function begins in the IOThread then the handler may lose the race for the AioContext lock. By the time the vq handler is able to acquire the AioContext lock the ioeventfd has already been removed and the handler isn't supposed to run anymore! Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal from within the IOThread. This way no races with the vq handler are possible. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20180307144205.20619-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-03-08 17:38:51 +00:00
Sergio Lopez	12c1c7d7ce	virtio-blk: dataplane: Don't batch notifications if EVENT_IDX is present Commit `5b2ffbe4d9` ("virtio-blk: dataplane: notify guest as a batch") deferred guest notification to a BH in order batch notifications, with purpose of avoiding flooding the guest with interruptions. This optimization came with a cost. The average latency perceived in the guest is increased by a few microseconds, but also when multiple IO operations finish at the same time, the guest won't be notified until all completions from each operation has been run. On the contrary, virtio-scsi issues the notification at the end of each completion. On the other hand, nowadays we have the EVENT_IDX feature that allows a better coordination between QEMU and the Guest OS to avoid sending unnecessary interruptions. With this change, virtio-blk/dataplane only batches notifications if the EVENT_IDX feature is not present. Some numbers obtained with fio (ioengine=sync, iodepth=1, direct=1): - Test specs: * fio-3.4 (ioengine=sync, iodepth=1, direct=1) * qemu master * virtio-blk with a dedicated iothread (default poll-max-ns) * backend: null_blk nr_devices=1 irqmode=2 completion_nsec=280000 * 8 vCPUs pinned to isolated physical cores * Emulator and iothread also pinned to separate isolated cores * variance between runs < 1% - Not patched * numjobs=1: lat_avg=327.32 irqs=29998 * numjobs=4: lat_avg=337.89 irqs=29073 * numjobs=8: lat_avg=342.98 irqs=28643 - Patched: * numjobs=1: lat_avg=323.92 irqs=30262 * numjobs=4: lat_avg=332.65 irqs=29520 * numjobs=8: lat_avg=335.54 irqs=29323 Signed-off-by: Sergio Lopez <slp@redhat.com> Message-id: 20180307114459.26636-1-slp@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-03-08 15:59:03 +00:00
Gal Hammer	76143618a5	virtio: remove event notifier cleanup call on de-assign The virtio_bus_set_host_notifier function no longer calls event_notifier_cleanup when a event notifier is removed. The commit updates the code to match the new behavior and calls virtio_bus_cleanup_host_notifier after the notifier was de-assign and no longer in use. This change is a preparation to allow executing the virtio_bus_set_host_notifier function in a memory region transaction. Signed-off-by: Gal Hammer <ghammer@redhat.com> Reviewed-by: Greg Kurz <groug@kaod.org> Tested-by: Greg Kurz <groug@kaod.org> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2018-02-08 21:06:26 +02:00
Mao Zhongyi	9d3b155186	hw/block: Fix the return type When the function no success value to transmit, it usually make the function return void. It has turned out not to be a success, because it means that the extra local_err variable and error_propagate() will be needed. It leads to cumbersome code, therefore, transmit success/ failure in the return value is worth. So fix the return type of blkconf_apply_backend_options(), blkconf_geometry() and virtio_blk_data_plane_create() to avoid it. Cc: John Snow <jsnow@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: ac0edc1fc70c4457e5cec94405eb7d1f89f9c2c1.1511317952.git.maozy.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-12-19 09:25:28 +00:00
Fam Zheng	0793169870	virtio: Report real progress in VQ aio poll handler In virtio_queue_host_notifier_aio_poll, not all "!virtio_queue_empty()" cases are making true progress. Currently the offending one is virtio-scsi event queue, whose handler does nothing if no event is pending. As a result aio_poll() will spin on the "non-empty" VQ and take 100% host CPU. Fix this by reporting actual progress from virtio queue aio handlers. Reported-by: Ed Swierk <eswierk@skyportsystems.com> Signed-off-by: Fam Zheng <famz@redhat.com> Tested-by: Ed Swierk <eswierk@skyportsystems.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2017-02-17 21:52:30 +02:00
Paolo Bonzini	83d768b564	virtio: set ISR on dataplane notifications Dataplane has been omitting forever the step of setting ISR when an interrupt is raised. This caused little breakage, because the specification actually says that ISR may not be updated in MSI mode. Some versions of the Windows drivers however didn't clear MSI mode correctly, and proceeded using polling mode (using ISR, not the used ring index!) for crashdump and hibernation. If it were just crashdump and hibernation it would not be a big deal, but recent releases of Windows do not really shut down, but rather log out and hibernate to make the next startup faster. Hence, this manifested as a more serious hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs. Newer versions fixed this, while older versions do not use MSI at all. The failure has always been there for virtio dataplane, but it became visible after commits `9ffe337` ("virtio-blk: always use dataplane path if ioeventfd is active", 2016-10-30) and `ad07cd6` ("virtio-scsi: always use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk and virtio-scsi always use the dataplane code under KVM. The good news therefore is that it was not a bug in the patches---they were doing exactly what they were meant for, i.e. shake out remaining dataplane bugs. The fix is not hard, so it's worth arranging for the broken drivers. The virtio_should_notify+event_notifier_set pair that is common to virtio-blk and virtio-scsi dataplane is replaced with a new public function virtio_notify_irqfd that also sets ISR. The irqfd emulation code now need not set ISR anymore, so virtio_irq is removed. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com> Tested-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-11-18 17:29:25 +02:00
Paolo Bonzini	9ffe337c08	virtio-blk: always use dataplane path if ioeventfd is active Override start_ioeventfd and stop_ioeventfd to start/stop the whole dataplane logic. This has some positive side effects: - no need anymore for virtio_add_queue_aio (i.e. a revert of commit `0ff841f6d1`) - no need anymore to switch from generic ioeventfd handlers to dataplane It detects some errors better: $ qemu-system-x86_64 -object iothread,id=io \ -drive id=null,file=null-aio://,if=none,format=raw \ -device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null qemu-system-x86_64: -device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null: ioeventfd is required for iothread while previously it would have started just fine. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-10-30 19:51:32 +02:00
Paolo Bonzini	b13d396227	virtio: move ioeventfd_started flag to VirtioBusState This simplifies the code and removes the ioeventfd_started and ioeventfd_set_started callback. The only difference is in how virtio-ccw handles an error---it doesn't disable ioeventfd forever anymore. It was the only backend to do so, and if desired this behavior should be implemented in virtio-bus.c. Instead of ioeventfd_started, the ioeventfd_assign callback now determines whether the virtio bus supports host notifiers. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-10-30 19:51:32 +02:00
Cao jin	ab3b9c1be8	virtio-blk: dataplane cleanup No need duplicate the judgment, there is one in function entry. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Max Reitz <mreitz@redhat.com> Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1468814749-14510-1-git-send-email-caoj.fnst@cn.fujitsu.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-18 15:10:52 +01:00
Stefan Hajnoczi	51b04ac5c6	virtio-blk: dataplane multiqueue support Monitor ioeventfds for all virtqueues in the device's AioContext. This is not true multiqueue because requests from all virtqueues are processed in a single IOThread. In the future it will be possible to use multiple IOThreads when the QEMU block layer supports multiqueue. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466511196-12612-7-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-28 13:08:32 +01:00
Stefan Hajnoczi	b234cdda95	virtio-blk: tell dataplane which vq to notify Let the virtio_blk_data_plane_notify() caller decide which virtqueue to notify. This will allow the function to be used with multiqueue. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466511196-12612-4-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-28 13:08:32 +01:00
Stefan Hajnoczi	e21737ab15	virtio-blk: multiqueue batch notify The batch notification BH needs to know which virtqueues to notify when multiqueue is enabled. Use a bitmap to track the virtqueues with pending notifications. At this point there is only one virtqueue so hard-code virtqueue index 0. A later patch will switch to real virtqueue indices. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 1466511196-12612-3-git-send-email-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-28 13:08:31 +01:00
Cornelia Huck	21a4d96243	virtio-bus: remove old set_host_notifier callback All users have been converted to the new ioevent callbacks. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-06-24 08:47:35 +03:00
Cornelia Huck	b1f0a33d80	virtio-bus: have callers tolerate new host notifier api Have vhost and dataplane use the new api for transports that have been converted. Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-06-24 08:47:35 +03:00
Fam Zheng	3482958383	virtio-blk: Remove op blocker for dataplane Block layer is prepared to unspecialize dataplane, an evidence is this almost complete list of unblocked operations. It has all types except two (actually three if DATAPLANE itself counts but blockdev.c makes sure attaching twice is not possible): MIRROR_TARGET and BACKUP_TARGET. blockdev-mirror refuses to start if target is attached, so the first is not a problem. By removing BACKUP_TARGET, blockdev-backup will become permissive to write to a virtio-blk dataplane disk, but that is not worse than non-dataplane given the latter is already possible. In either case, blockdev.c always checks the target and source are on the same AioContext, or bring them together if possible. Signed-off-by: Fam Zheng <famz@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Message-id: 1463969978-24970-4-git-send-email-famz@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-06-07 14:40:51 +01:00
Paolo Bonzini	a378b49a43	virtio: merge virtio_queue_aio_set_host_notifier_handler with virtio_queue_set_aio Eliminating the reentrancy is actually a nice thing that we can do with the API that Michael proposed, so let's make it first class. This also hides the complex assign/set_handler conventions from callers of virtio_queue_aio_set_host_notifier_handler, which in fact was always called with assign=true. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-04-07 19:57:33 +03:00
Michael S. Tsirkin	8a2fad57eb	virtio-blk: use aio handler for data plane In addition to handling IO in vcpu thread and in io thread, dataplane introduces yet another mode: handling it by AioContext. This reuses the same handler as previous modes, which triggers races as these were not designed to be reentrant. Use a separate handler just for aio, and disable regular handlers when dataplane is active. Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-04-07 19:57:33 +03:00
Paolo Bonzini	eb41cf78fc	virtio-blk: fix disabled mode We must not call virtio_blk_data_plane_notify if dataplane is disabled: we would hit a segmentation fault in notify_guest_bh as s->guest_notifier has not been setup and is NULL. Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-04-07 19:57:33 +03:00
Markus Armbruster	da34e65cb4	include/qemu/osdep.h: Don't include qapi/error.h Commit `57cb38b` included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:15 +01:00
Paolo Bonzini	03de2f5274	virtio-blk: do not use vring in dataplane Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-02-25 13:14:18 +02:00
Paolo Bonzini	2906cddfec	virtio-blk: fix "disabled data plane" mode In disabled mode, virtio-blk dataplane seems to be enabled, but flow actually goes through the normal virtio path. This patch simplifies a bit the handling of disabled mode. In disabled mode, virtio_blk_handle_output might be called even if s->dataplane is not NULL. This is a bit tricky, because the current check for s->dataplane will always trigger, causing a continuous stream of calls to virtio_blk_data_plane_start. Unfortunately, these calls will not do anything. To fix this, set the "started" flag even in disabled mode, and skip virtio_blk_data_plane_start if the started flag is true. The resulting changes also prepare the code for the next patch, were virtio-blk dataplane will reuse the same virtio_blk_handle_output function as "regular" virtio-blk. Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have to move s->dataplane->started inside struct VirtIOBlock. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-02-25 13:14:18 +02:00
Paolo Bonzini	8b1fe1cedf	vring: make vring_enable_notification return void Make the API more similar to the regular virtqueue API. This will help when modifying the code to not use vring.c anymore. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Fam Zheng <famz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-02-25 13:14:18 +02:00
Paolo Bonzini	51b19ebe43	virtio: move allocation to virtqueue_pop/vring_pop The return code of virtqueue_pop/vring_pop is unused except to check for errors or 0. We can thus easily move allocation inside the functions and just return a pointer to the VirtQueueElement. The advantage is that we will be able to allocate only the space that is needed for the actual size of the s/g list instead of the full VIRTQUEUE_MAX_SIZE items. Currently VirtQueueElement takes about 48K of memory, and this kind of allocation puts a lot of stress on malloc. By cutting the size by two or three orders of magnitude, malloc can use much more efficient algorithms. The patch is pretty large, but changes to each device are testable more or less independently. Splitting it would mostly add churn. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>	2016-02-06 20:39:07 +02:00
Max Reitz	1b1e0659a4	virtio-blk: Functions for op blocker management Put the code for setting up and removing op blockers into an own function, respectively. Then, we can invoke those functions whenever a BDS is removed from an virtio-blk BB or inserted into it. Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-02-02 17:50:46 +01:00
Peter Maydell	80c71a241a	block: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-01-20 13:36:23 +01:00
Markus Armbruster	e43bfd9c87	error: Use error_prepend() where it makes obvious sense Done with this Coccinelle semantic patch @@ expression FMT, E1, E2; expression list ARGS; @@ - error_setg(E1, FMT, ARGS, error_get_pretty(E2)); + error_propagate(E1, E2);/###/ + error_prepend(E1, FMT/@@@/, ARGS); followed by manual cleanup, first because I can't figure out how to make Coccinelle transform strings, and second to get rid of now superfluous error_propagate(). We now use or propagate the original error whole instead of just its message obtained with error_get_pretty(). This avoids suppressing its hint (see commit `50b7b00`), but I can't see how the errors touched in this commit could come with hints. It also improves the message printed with &error_abort when we screw up (see commit `1e9b65b`). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2016-01-13 15:16:17 +01:00
Fam Zheng	05e4d14bf3	block: Rename BLOCK_OP_TYPE_MIRROR to BLOCK_OP_TYPE_MIRROR_SOURCE It's necessary to distinguish source and target before we can add blockdev-mirror, because we would want a concrete type of operation to check on target bs before starting. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1450932306-13717-2-git-send-email-famz@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2016-01-07 21:30:17 +01:00
Fam Zheng	a616fb75c2	virtio-blk: Drop x-data-plane option The official way of enabling dataplane is through the "iothread" property that references an iothread object created by "-object iothread". Since the old "x-data-plane=on" way now even crashes, it's probably easier to just drop it: $ qemu-system-x86_64 -drive file=null-co://,id=d0,if=none \ -device virtio-blk-pci,drive=d0,x-data-plane=on ERROR:/home/fam/work/qemu/qom/object.c:1515: object_get_canonical_path_component: assertion failed: (obj->parent != NULL) Aborted Signed-off-by: Fam Zheng <famz@redhat.com> Message-id: 1449485967-19240-1-git-send-email-famz@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2015-12-07 16:47:16 +00:00
Fam Zheng	3a1e8074d7	dataplane: Mark host notifiers' client type as "external" They will be excluded by type in the nested event loops in block layer, so that unwanted events won't be processed there. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-10-23 18:18:24 +02:00
Fam Zheng	dca21ef23b	aio: Add "is_external" flag for event handlers All callers pass in false, and the real external ones will switch to true in coming patches. Signed-off-by: Fam Zheng <famz@redhat.com> Reviewed-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-10-23 18:18:23 +02:00
Paolo Bonzini	fed105e275	virtio-blk-dataplane: delete bottom half before the AioContext is freed Other uses of aio_bh_new are safe as long as all scheduled bottom halves are run before an iothread is destroyed, which bdrv_drain will ensure: - archipelago_finish_aiocb: BH deletes itself - inject_error: BH deletes itself - blkverify_aio_bh: BH deletes itself - abort_aio_request: BH deletes itself - curl_aio_readv: BH deletes itself - gluster_finish_aiocb: BH deletes itself - bdrv_aio_rw_vector: BH deletes itself - bdrv_co_maybe_schedule_bh: BH deletes itself - iscsi_schedule_bh, iscsi_co_generic_cb: BH deletes itself - laio_attach_aio_context: deleted in laio_detach_aio_context, called through bdrv_detach_aio_context before deleting the iothread - nfs_co_generic_cb: BH deletes itself - null_aio_common: BH deletes itself - qed_aio_complete: BH deletes itself - rbd_finish_aiocb: BH deletes itself - dma_blk_cb: BH deletes itself - virtio_blk_dma_restart_cb: BH deletes itself - qemu_bh_new: main loop AioContext is never destroyed - test-aio.c: bh_delete_cb deletes itself, otherwise deleted in the same function that calls aio_bh_new Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1438101249-25166-2-git-send-email-pbonzini@redhat.com Message-Id: <1438086628-13000-1-git-send-email-pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-07-29 10:02:06 +01:00
Paolo Bonzini	2a6cdd6d35	virtio-blk: correctly dirty guest memory After qemu_iovec_destroy, the QEMUIOVector's size is zeroed and the zero size ultimately is used to compute virtqueue_push's len argument. Therefore, reads from virtio-blk devices did not migrate their results correctly. (Writes were okay). Save the size in virtio_blk_handle_request, and use it when the request is completed. Based on a patch by Wen Congyang. Signed-off-by: Wen Congyang <wency@cn.fujitsu.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Tested-by: Li Zhijian <lizhijian@cn.fujitsu.com> Message-id: 1427997044-392-1-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-04-08 10:39:18 +01:00
Cornelia Huck	b0e5d90ebc	dataplane: endianness-aware accesses The vring.c code currently assumes that guest and host endianness match, which is not true for a number of cases: - emulating targets with a different endianness than the host - bi-endian targets, where the correct endianness depends on the virtio device - upcoming support for the virtio-1 standard mandates little-endian accesses even for big-endian targets and hosts Make sure to use accessors that depend on the virtio device. Note that dataplane now needs to be built per-target. Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Fam Zheng <famz@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Tested-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1422289602-17874-2-git-send-email-cornelia.huck@de.ibm.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-02-16 15:07:16 +00:00
Peter Lieven	95f7142abc	virtio-blk: introduce multiread this patch finally introduces multiread support to virtio-blk. While multiwrite support was there for a long time, read support was missing. The complete merge logic is moved into virtio-blk.c which has been the only user of request merging ever since. This is required to be able to merge chunks of requests and immediately invoke callbacks for those requests. Secondly, this is required to switch to direct invocation of coroutines which is planned at a later stage. The following benchmarks show the performance of running fio with 4 worker threads on a local ram disk. The numbers show the average of 10 test runs after 1 run as warmup phase. \| 4k \| 64k \| 4k MB/s \| rd seq \| rd rand \| rd seq \| rd rand \| wr seq \| wr rand --------------+--------+---------+--------+---------+--------+-------- master \| 1221 \| 1187 \| 4178 \| 4114 \| 1745 \| 1213 multiread \| 1829 \| 1189 \| 4639 \| 4110 \| 1894 \| 1216 Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-02-06 17:24:21 +01:00
Fam Zheng	bb00021de0	block: Split BLOCK_OP_TYPE_COMMIT to BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET} Like BLOCK_OP_TYPE_BACKUP_SOURCE and BLOCK_OP_TYPE_BACKUP_TARGET, block-commit involves two asymmetric devices. This change is not user-visible (yet), because commit only works with device names. But once we enable backing reference in blockdev-add, or specifying node-name in block-commit command, we don't want the user to start two commit jobs on the same backing chain, which will corrupt things because of the final bdrv_swap. Before we have per category blockers, splitting this type is still better. [Resolved virtio-blk dataplane conflict by replacing BLOCK_OP_TYPE_COMMIT with both BLOCK_OP_TYPE_COMMIT_{SOURCE, TARGET}. They are safe since the block job runs in the same AioContext as the dataplane IOThread. --Stefan] Signed-off-by: Fam Zheng <famz@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-01-13 13:43:29 +00:00
Stefan Hajnoczi	5d6e96efb8	blockdev: acquire AioContext in QMP 'transaction' actions The transaction QMP command performs operations atomically on a group of drives. This command needs to acquire AioContext in order to work safely when virtio-blk dataplane IOThreads are accessing drives. The transactional nature of the command means that actions are split into prepare, commit, abort, and clean functions. Acquire the AioContext in prepare and don't release it until one of the other functions is called. This prevents the IOThread from running the AioContext before the transaction has completed. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1416566940-4430-4-git-send-email-stefanha@redhat.com Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-12-10 10:31:13 +01:00
Stefan Hajnoczi	729962f6db	blockdev: acquire AioContext in change-backing-file Add dataplane support to the change-backing-file QMP commands. By acquiring the AioContext we avoid race conditions with the dataplane thread which may also be accessing the BlockDriverState. Note that this command operates on both bs and a node in its chain (image_bs). The bdrv_chain_contains(bs, image_bs) check guarantees that bs and image_bs are in the same AioContext. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-12-10 10:25:30 +01:00
Stefan Hajnoczi	e3442099a2	blockdev: acquire AioContext in eject, change, and block_passwd By acquiring the AioContext we avoid race conditions with the dataplane thread which may also be accessing the BlockDriverState. Fix up eject, change, and block_passwd in a single patch because qmp_eject() and qmp_change_blockdev() both call eject_device(). Also fix block_passwd while we're tackling a command that takes a block encryption password. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-12-10 10:25:30 +01:00
Stefan Hajnoczi	4ef3982a99	blockdev: acquire AioContext in blockdev-snapshot-delete-internal-sync Add dataplane support to the blockdev-snapshot-delete-internal-sync QMP command. By acquiring the AioContext we avoid race conditions with the dataplane thread which may also be accessing the BlockDriverState. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-12-10 10:25:30 +01:00
Stefan Hajnoczi	b112a65c52	block: declare blockjobs and dataplane friends! Now that blockjobs use AioContext they are safe for use with dataplane. Unblock them! Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 1413889440-32577-12-git-send-email-stefanha@redhat.com	2014-11-03 11:41:49 +00:00
Markus Armbruster	4be746345f	hw: Convert from BlockDriverState to BlockBackend, mostly Device models should access their block backends only through the block-backend.h API. Convert them, and drop direct includes of inappropriate headers. Just four uses of BlockDriverState are left: * The Xen paravirtual block device backend (xen_disk.c) opens images itself when set up via xenbus, bypassing blockdev.c. I figure it should go through qmp_blockdev_add() instead. * Device model "usb-storage" prompts for keys. No other device model does, and this one probably shouldn't do it, either. * ide_issue_trim_cb() uses bdrv_aio_discard() instead of blk_aio_discard() because it fishes its backend out of a BlockAIOCB, which has only the BlockDriverState. * PC87312State has an unused BlockDriverState[] member. The next two commits take care of the latter two. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-10-20 14:02:25 +02:00
Markus Armbruster	2a30307f70	virtio-blk: Rename VirtIOBlkConf variables to conf This is consistent with how VirtIOFOOConf variables are named elsewhere, and makes blk available for BlockBackend variables. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-10-20 14:02:21 +02:00

1 2

92 Commits