A gcc codegen bug in x86_64-w64-mingw32-gcc (GCC) 4.6.3 means that
non-debug builds of QEMU for Windows tend to assert when using
coroutines. Work around this by marking qemu_coroutine_switch
as noinline.
If we allow gcc to inline qemu_coroutine_switch into
coroutine_trampoline, then it hoists the code to get the
address of the TLS variable "current" out of the while() loop.
This is an invalid transformation because the SwitchToFiber()
call may be called when running thread A but return in thread B,
and so we might be in a different thread context each time
round the loop. This can happen quite often. Typically.
a coroutine is started when a VCPU thread does bdrv_aio_readv:
VCPU thread
main VCPU thread coroutine I/O coroutine
bdrv_aio_readv ----->
start I/O operation
thread_pool_submit_co
<------------ yields
back to emulation
Then I/O finishes and the thread-pool.c event notifier triggers in
the I/O thread. event_notifier_ready calls thread_pool_co_cb, and
the I/O coroutine now restarts *in another thread*:
iothread
main iothread coroutine I/O coroutine (formerly in VCPU thread)
event_notifier_ready
thread_pool_co_cb -----> current = I/O coroutine;
call AIO callback
But on Win32, because of the bug, the "current" being set here the
current coroutine of the VCPU thread, not the iothread.
noinline is a good-enough workaround, and quite unlikely to break in
the future.
(Thanks to Paolo Bonzini for assistance in diagnosing the problem
and providing the detailed example/ascii art quoted above.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1403535303-14939-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Asynchronous code is becoming very complex. At the same time
synchronous code is growing because it is convenient to write.
Sometimes duplicate code paths are even added, one synchronous and the
other asynchronous. This patch introduces coroutines which allow code
that looks synchronous but is asynchronous under the covers.
A coroutine has its own stack and is therefore able to preserve state
across blocking operations, which traditionally require callback
functions and manual marshalling of parameters.
Creating and starting a coroutine is easy:
coroutine = qemu_coroutine_create(my_coroutine);
qemu_coroutine_enter(coroutine, my_data);
The coroutine then executes until it returns or yields:
void coroutine_fn my_coroutine(void *opaque) {
MyData *my_data = opaque;
/* do some work */
qemu_coroutine_yield();
/* do some more work */
}
Yielding switches control back to the caller of qemu_coroutine_enter().
This is typically used to switch back to the main thread's event loop
after issuing an asynchronous I/O request. The request callback will
then invoke qemu_coroutine_enter() once more to switch back to the
coroutine.
Note that if coroutines are used only from threads which hold the global
mutex they will never execute concurrently. This makes programming with
coroutines easier than with threads. Race conditions cannot occur since
only one coroutine may be active at any time. Other coroutines can only
run across yield.
This coroutines implementation is based on the gtk-vnc implementation
written by Anthony Liguori <anthony@codemonkey.ws> but it has been
significantly rewritten by Kevin Wolf <kwolf@redhat.com> to use
setjmp()/longjmp() instead of the more expensive swapcontext() and by
Paolo Bonzini <pbonzini@redhat.com> for Windows Fibers support.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>