qemu-e2k

Author	SHA1	Message	Date
Paolo Bonzini	050de36b13	coroutine-lock: Reimplement CoRwlock to fix downgrade bug An invariant of the current rwlock is that if multiple coroutines hold a reader lock, all must be runnable. The unlock implementation relies on this, choosing to wake a single coroutine when the final read lock holder exits the critical section, assuming that it will wake a coroutine attempting to acquire a write lock. The downgrade implementation violates this assumption by creating a read lock owning coroutine that is exclusively runnable - any other coroutines that are waiting to acquire a read lock are not made runnable when the write lock holder converts its ownership to read only. More in general, the old implementation had lots of other fairness bugs. The root cause of the bugs was that CoQueue would wake up readers even if there were pending writers, and would wake up writers even if there were readers. In that case, the coroutine would go back to sleep at the end of the CoQueue, losing its place at the head of the line. To fix this, keep the queue of waiters explicitly in the CoRwlock instead of using CoQueue, and store for each whether it is a potential reader or a writer. This way, downgrade can look at the first queued coroutines and wake it only if it is a reader, causing all other readers in line to be released in turn. Reported-by: David Edmondson <david.edmondson@oracle.com> Reviewed-by: David Edmondson <david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20210325112941.365238-5-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-03-31 10:44:21 +01:00
David Edmondson	2f6ef0393b	coroutine-lock: Store the coroutine in the CoWaitRecord only once When taking the slow path for mutex acquisition, set the coroutine value in the CoWaitRecord in push_waiter(), rather than both there and in the caller. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: David Edmondson <david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20210325112941.365238-4-pbonzini@redhat.com Message-Id: <20210309144015.557477-4-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-03-31 10:44:21 +01:00
Marc-André Lureau	e18d9a9687	coroutine: let CoQueue wake up outside a coroutine The assert() was added in commit `b681a1c73e` ("block: Repair the throttling code."), when the qemu_co_queue_do_restart() function required to be running in a coroutine. It was later made unnecessary in commit `a9d9235567` ("coroutine-lock: reschedule coroutine on the AioContext it was running on"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Gerd Hoffmann <kraxel@redhat.com> Message-id: 20201027133602.3038018-2-marcandre.lureau@redhat.com Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>	2020-11-04 08:02:24 +01:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Stefan Hajnoczi	c40a254570	coroutine: avoid co_queue_wakeup recursion qemu_aio_coroutine_enter() is (indirectly) called recursively when processing co_queue_wakeup. This can lead to stack exhaustion. This patch rewrites co_queue_wakeup in an iterative fashion (instead of recursive) with bounded memory usage to prevent stack exhaustion. qemu_co_queue_run_restart() is inlined into qemu_aio_coroutine_enter() and the qemu_coroutine_enter() call is turned into a loop to avoid recursion. There is one change that is worth mentioning: Previously, when coroutine A queued coroutine B, qemu_co_queue_run_restart() entered coroutine B from coroutine A. If A was terminating then it would still stay alive until B yielded. After this patch B is entered by A's parent so that a A can be deleted immediately if it is terminating. It is safe to make this change since B could never interact with A if it was terminating anyway. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 20180322152834.12656-3-stefanha@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-03-27 13:05:28 +01:00
Marc-André Lureau	d2f668b749	misc: fix spelling s/pupulate/populate Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20180208162447.10851-1-marcandre.lureau@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2018-02-15 09:39:49 +00:00
Paolo Bonzini	5261dd7b01	coroutine-lock: make qemu_co_enter_next thread-safe qemu_co_queue_next does not need to release and re-acquire the mutex, because the queued coroutine does not run immediately. However, this does not hold for qemu_co_enter_next. Now that qemu_co_queue_wait can synchronize (via QemuLockable) with code that is not running in coroutine context, it's important that code using qemu_co_enter_next can easily use a standardized locking idiom. First of all, qemu_co_enter_next must use aio_co_wake to restart the coroutine. Second, the function gains a second argument, a QemuLockable*, and the comments of qemu_co_queue_next and qemu_co_queue_restart_all are adjusted to clarify the difference. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180203153935.8056-5-pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2018-02-08 09:22:03 +08:00
Paolo Bonzini	1a957cf9c4	coroutine-lock: convert CoQueue to use QemuLockable There are cases in which a queued coroutine must be restarted from non-coroutine context (with qemu_co_enter_next). In this cases, qemu_co_enter_next also needs to be thread-safe, but it cannot use a CoMutex and so cannot qemu_co_queue_wait. Use QemuLockable so that the CoQueue can interchangeably use CoMutex or QemuMutex. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180203153935.8056-4-pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2018-02-08 09:22:03 +08:00
Paolo Bonzini	667221c10d	coroutine-lock: add qemu_co_rwlock_downgrade and qemu_co_rwlock_upgrade These functions are more efficient in the presence of contention. qemu_co_rwlock_downgrade also guarantees not to block, which may be useful in some algorithms too. Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20170629132749.997-3-pbonzini@redhat.com> Signed-off-by: Fam Zheng <famz@redhat.com>	2017-07-17 11:28:15 +08:00
Roman Pen	528f449f59	coroutine-lock: do not touch coroutine after another one has been entered Submission of requests on linux aio is a bit tricky and can lead to requests completions on submission path: `44713c9e85` ("linux-aio: Handle io_submit() failure gracefully") `0ed93d84ed` ("linux-aio: process completions from ioq_submit()") That means that any coroutine which has been yielded in order to wait for completion can be resumed from submission path and be eventually terminated (freed). The following use-after-free crash was observed when IO throttling was enabled: Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 0x7f5813dff700 (LWP 56417)] virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252 (gdb) bt #0 virtqueue_unmap_sg (elem=0x7f5804009a30, len=1, vq=<optimized out>) at virtio.c:252 ^^^^^^^^^^^^^^ remember the address #1 virtqueue_fill (vq=0x5598b20d21b0, elem=0x7f5804009a30, len=1, idx=0) at virtio.c:282 #2 virtqueue_push (vq=0x5598b20d21b0, elem=elem@entry=0x7f5804009a30, len=<optimized out>) at virtio.c:308 #3 virtio_blk_req_complete (req=req@entry=0x7f5804009a30, status=status@entry=0 '\000') at virtio-blk.c:61 #4 virtio_blk_rw_complete (opaque=<optimized out>, ret=0) at virtio-blk.c:126 #5 blk_aio_complete (acb=0x7f58040068d0) at block-backend.c:923 #6 coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:78 (gdb) p * elem $8 = {index = 77, out_num = 2, in_num = 1, in_addr = 0x7f5804009ad8, out_addr = 0x7f5804009ae0, in_sg = 0x0, out_sg = 0x7f5804009a50} ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ 'in_sg' and 'out_sg' are invalid. e.g. it is impossible that 'in_sg' is zero, instead its value must be equal to: (gdb) p/x 0x7f5804009ad8 + sizeof(elem->in_addr[0]) + 2 * sizeof(elem->out_addr[0]) $26 = 0x7f5804009af0 Seems 'elem' was corrupted. Meanwhile another thread raised an abort: Thread 12 (Thread 0x7f57f2ffd700 (LWP 56426)): #0 raise () from /lib/x86_64-linux-gnu/libc.so.6 #1 abort () from /lib/x86_64-linux-gnu/libc.so.6 #2 qemu_coroutine_enter (co=0x7f5804009af0) at qemu-coroutine.c:113 #3 qemu_co_queue_run_restart (co=0x7f5804009a30) at qemu-coroutine-lock.c:60 #4 qemu_coroutine_enter (co=0x7f5804009a30) at qemu-coroutine.c:119 ^^^^^^^^^^^^^^^^^^ WTF?? this is equal to elem from crashed thread #5 qemu_co_queue_run_restart (co=0x7f57e7f16ae0) at qemu-coroutine-lock.c:60 #6 qemu_coroutine_enter (co=0x7f57e7f16ae0) at qemu-coroutine.c:119 #7 qemu_co_queue_run_restart (co=0x7f5807e112a0) at qemu-coroutine-lock.c:60 #8 qemu_coroutine_enter (co=0x7f5807e112a0) at qemu-coroutine.c:119 #9 qemu_co_queue_run_restart (co=0x7f5807f17820) at qemu-coroutine-lock.c:60 #10 qemu_coroutine_enter (co=0x7f5807f17820) at qemu-coroutine.c:119 #11 qemu_co_queue_run_restart (co=0x7f57e7f18e10) at qemu-coroutine-lock.c:60 #12 qemu_coroutine_enter (co=0x7f57e7f18e10) at qemu-coroutine.c:119 #13 qemu_co_enter_next (queue=queue@entry=0x5598b1e742d0) at qemu-coroutine-lock.c:106 #14 timer_cb (blk=0x5598b1e74280, is_write=<optimized out>) at throttle-groups.c:419 Crash can be explained by access of 'co' object from the loop inside qemu_co_queue_run_restart(): while ((next = QSIMPLEQ_FIRST(&co->co_queue_wakeup))) { QSIMPLEQ_REMOVE_HEAD(&co->co_queue_wakeup, co_queue_next); ^^^^^^^^^^^^^^^^^^^^ on each iteration 'co' is accessed, but 'co' can be already freed qemu_coroutine_enter(next); } When 'next' coroutine is resumed (entered) it can in its turn resume 'co', and eventually free it. That's why we see 'co' (which was freed) has the same address as 'elem' from the first backtrace. The fix is obvious: use temporary queue and do not touch coroutine after first qemu_coroutine_enter() is invoked. The issue is quite rare and happens every ~12 hours on very high IO and CPU load (building linux kernel with -j512 inside guest) when IO throttling is enabled. With the fix applied guest is running ~35 hours and is still alive so far. Signed-off-by: Roman Pen <roman.penyaev@profitbricks.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20170601160847.23720-1-roman.penyaev@profitbricks.com Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Fam Zheng <famz@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Cc: Kevin Wolf <kwolf@redhat.com> Cc: qemu-devel@nongnu.org Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-06-07 14:39:00 +01:00
Paolo Bonzini	a7b91d35ba	coroutine-lock: make CoRwlock thread-safe and fair This adds a CoMutex around the existing CoQueue. Because the write-side can just take CoMutex, the old "writer" field is not necessary anymore. Instead of removing it altogether, count the number of pending writers during a read-side critical section and forbid further readers from entering. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-7-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	1ace7ceac5	coroutine-lock: add mutex argument to CoQueue APIs All that CoQueue needs in order to become thread-safe is help from an external mutex. Add this to the API. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-6-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	480cff6322	coroutine-lock: add limited spinning to CoMutex Running a very small critical section on pthread_mutex_t and CoMutex shows that pthread_mutex_t is much faster because it doesn't actually go to sleep. What happens is that the critical section is shorter than the latency of entering the kernel and thus FUTEX_WAIT always fails. With CoMutex there is no such latency but you still want to avoid wait and wakeup. So introduce it artificially. This only works with one waiters; because CoMutex is fair, it will always have more waits and wakeups than a pthread_mutex_t. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-3-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	fed20a70e3	coroutine-lock: make CoMutex thread-safe This uses the lock-free mutex described in the paper '"Blocking without Locking", or LFTHREADS: A lock-free thread library' by Gidenstam and Papatriantafilou. The same technique is used in OSv, and in fact the code is essentially a conversion to C of OSv's code. [Added missing coroutine_fn in tests/test-aio-multithread.c. --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Message-id: 20170213181244.16297-2-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:39:40 +00:00
Paolo Bonzini	a9d9235567	coroutine-lock: reschedule coroutine on the AioContext it was running on As a small step towards the introduction of multiqueue, we want coroutines to remain on the same AioContext that started them, unless they are moved explicitly with e.g. aio_co_schedule. This patch avoids that coroutines switch AioContext when they use a CoMutex. For now it does not make much of a difference, because the CoMutex is not thread-safe and the AioContext itself is used to protect the CoMutex from concurrent access. However, this is going to change. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Message-id: 20170213135235.12274-9-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2017-02-21 11:14:08 +00:00
Kevin Wolf	1b7f01d966	coroutine: Assert that no locks are held on termination A coroutine that takes a lock must also release it again. If the coroutine terminates without having released all its locks, it's buggy and we'll probably run into a deadlock sooner or later. Make sure that we don't get such cases. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-05 19:06:48 +02:00
Kevin Wolf	0e438cdc93	coroutine: Let CoMutex remember who holds it In cases of deadlocks, knowing who holds a given CoMutex is really helpful for debugging. Keeping the information around doesn't cost much and allows us to add another assertion to keep the code correct, so let's just add it. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-09-05 19:06:48 +02:00
Paolo Bonzini	0b8b8753e4	coroutine: move entry argument to qemu_coroutine_create In practice the entry argument is always known at creation time, and it is confusing that sometimes qemu_coroutine_enter is used with a non-NULL argument to re-enter a coroutine (this happens in block/sheepdog.c and tests/test-coroutine.c). So pass the opaque value at creation time, for consistency with e.g. aio_bh_new. Mostly done with the following semantic patch: @ entry1 @ expression entry, arg, co; @@ - co = qemu_coroutine_create(entry); + co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry2 @ expression entry, arg; identifier co; @@ - Coroutine co = qemu_coroutine_create(entry); + Coroutine co = qemu_coroutine_create(entry, arg); ... - qemu_coroutine_enter(co, arg); + qemu_coroutine_enter(co); @ entry3 @ expression entry, arg; @@ - qemu_coroutine_enter(qemu_coroutine_create(entry), arg); + qemu_coroutine_enter(qemu_coroutine_create(entry, arg)); @ reentry @ expression co; @@ - qemu_coroutine_enter(co, NULL); + qemu_coroutine_enter(co); except for the aforementioned few places where the semantic patch stumbled (as expected) and for test_co_queue, which would otherwise produce an uninitialized variable warning. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Paolo Bonzini	7d9c858137	coroutine: use QSIMPLEQ instead of QTAILQ CoQueue do not need to remove any element but the head of the list; processing is always strictly FIFO. Therefore, the simpler singly-linked QSIMPLEQ can be used instead. Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-13 13:26:02 +02:00
Peter Maydell	aafd758410	util: Clean up includes Clean up includes so that osdep.h is included first and headers which it implies are not included manually. This commit was created with scripts/clean-includes. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 1454089805-5470-6-git-send-email-peter.maydell@linaro.org	2016-02-04 17:01:04 +00:00
Daniel P. Berrange	10817bf09d	coroutine: move into libqemuutil.a library The coroutine files are currently referenced by the block-obj-y variable. The coroutine functionality though is already used by more than just the block code. eg migration code uses coroutine yield. In the future the I/O channel code will also use the coroutine yield functionality. Since the coroutine code is nicely self-contained it can be easily built as part of the libqemuutil.a library, making it widely available. The headers are also moved into include/qemu, instead of the include/block directory, since they are now part of the util codebase, and the impl was never in the block/ directory either. Signed-off-by: Daniel P. Berrange <berrange@redhat.com>	2015-10-20 14:59:04 +01:00

22 Commits