Commit Graph

14 Commits

Author SHA1 Message Date
Paolo Bonzini d1d1b206b0 coroutine-ucontext: use __thread
ELF thread local storage is about 10% faster on tests/test-coroutine's
perf/cost test.  The timing on my machine is 190ns per iteration with
pthread TLS, 170 with ELF TLS.

Based on a patch by Kevin Wolf and Peter Lieven, but redone to follow
the model of coroutine-win32.c (including the important "noinline"
attribute!).

Platforms without thread-local storage (OpenBSD probably?) will need
a new-enough GCC for this to compile, in order to use the same emutls
support that Windows already relies on.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1417518350-6167-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-01-13 13:43:28 +00:00
Markus Armbruster e6f53fd514 Fix warnings suppressors to honor --disable-werror
Replace

    #pragma GCC diagnostic ignored FOO
    [Troublesome code...]
    #pragma GCC diagnostic error FOO

by

    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored FOO
    [Troublesome code...]
    #pragma GCC diagnostic pop

Broken in commit 3f4349d, commit 092bb30, and commit c95e308.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1366113066-1340-1-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-04-17 10:28:04 -05:00
Anthony Liguori 864a556e9a Merge remote-tracking branch 'kwolf/for-anthony' into staging
# By Paolo Bonzini (7) and others
# Via Kevin Wolf
* kwolf/for-anthony: (22 commits)
  pc: add compatibility machine types for 1.4
  blockdev: enable discard by default
  qemu-nbd: add --discard option
  blockdev: add discard suboption to -drive
  block: implement BDRV_O_UNMAP
  block: complete all IOs before .bdrv_truncate
  coroutine: trim down nesting level in perf_nesting test
  coroutine: move pooling to common code
  qemu-iotests: Test qcow2 image creation options
  qemu-iotests: Add qemu-img compare test
  qemu-img: Add compare subcommand
  qemu-img: Add "Quiet mode" option
  block: Add synchronous wrapper for bdrv_co_is_allocated_above
  block: refuse negative iops and bps values
  block: use Error in do_check_io_limits()
  qcow2: support compressed clusters in BlockFragInfo
  qemu-img: add compressed clusters to BlockFragInfo
  qemu-img: fix missing space in qemu-img check output
  qcow2: record fragmentation statistics during check
  qcow2: introduce check_refcounts_l1/l2() flags
  ...
2013-02-26 07:44:39 -06:00
Peter Maydell 6ab7e5465a Replace all setjmp()/longjmp() with sigsetjmp()/siglongjmp()
The setjmp() function doesn't specify whether signal masks are saved and
restored; on Linux they are not, but on BSD (including MacOSX) they are.
We want to have consistent behaviour across platforms, so we should
always use "don't save/restore signal mask" (this is also generally
going to be faster). This also works around a bug in MacOSX where the
signal-restoration on longjmp() affects the signal mask for a completely
different thread, not just the mask for the thread which did the longjmp.
The most visible effect of this was that ctrl-C was ignored on MacOSX
because the CPU thread did a longjmp which resulted in its signal mask
being applied to every thread, so that all threads had SIGINT and SIGTERM
blocked.

The POSIX-sanctioned portable way to do a jump without affecting signal
masks is to siglongjmp() to a sigjmp_buf which was created by calling
sigsetjmp() with a zero savemask parameter, so change all uses of
setjmp()/longjmp() accordingly. [Technically POSIX allows sigsetjmp(buf, 0)
to save the signal mask; however the following siglongjmp() must not
restore the signal mask, so the pair can be effectively considered as
"sigjmp/longjmp which don't touch the mask".]

For Windows we provide a trivial sigsetjmp/siglongjmp in terms of
setjmp/longjmp -- this is OK because no user will ever pass a non-zero
savemask.

The setjmp() uses in tests/tcg/test-i386.c and tests/tcg/linux-test.c
are left untouched because these are self-contained singlethreaded
test programs intended to be run under QEMU's Linux emulation, so they
have neither the portability nor the multithreading issues to deal with.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Tested-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-02-23 16:11:19 +00:00
Paolo Bonzini 402397843e coroutine: move pooling to common code
The coroutine pool code is duplicated between the ucontext and
sigaltstack backends, and absent from the win32 backend.  But the
code can be shared easily by moving it to qemu-coroutine.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-02-22 21:21:10 +01:00
Gerd Hoffmann cc6e3ca93c gcc: rename CONFIG_PRAGMA_DISABLE_UNUSED_BUT_SET to CONFIG_PRAGMA_DIAGNOSTIC_AVAILABLE
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2013-01-12 12:42:53 +00:00
Paolo Bonzini 737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Peter Maydell 06d71fa148 configure: Split valgrind test into pragma test and valgrind.h test
Split the configure test that checks for valgrind into two, one
part checking whether we have the gcc pragma to disable unused-but-set
variables, and the other part checking for the existence of valgrind.h.
The first of these has to be compiled with -Werror and the second
does not and shouldn't generate any warnings.

This (a) allows us to enable "make errors in configure tests be
build failures" and (b) enables use of valgrind on systems with
a gcc which doesn't know about -Wunused-but-set-varibale, like
Debian squeeze.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-07-31 20:04:42 +00:00
Kevin Wolf 3f4349dc8b coroutine-ucontext: Help valgrind understand coroutines
valgrind tends to get confused and report false positives when you
switch stacks and don't tell it about it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-07-17 16:48:32 +02:00
Paolo Bonzini 1bbbdabd56 coroutine: switch to QSLIST
QSLIST can be used for a free list, do it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-02-17 08:33:33 -06:00
Avi Kivity 39a7a362e1 coroutine: switch per-thread free pool to a global pool
ucontext-based coroutines use a free pool to reduce allocations and
deallocations of coroutine objects.  The pool is per-thread, presumably
to improve locality.  However, as coroutines are usually allocated in
a vcpu thread and freed in the I/O thread, the pool accounting gets
screwed up and we end allocating and freeing a coroutine for every I/O
request.  This is expensive since large objects are allocated via the
kernel, and are not cached by the C runtime.

Fix by switching to a global pool.  This is safe since we're protected
by the global mutex.

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15 12:40:33 +01:00
Anthony Liguori 7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
malc 32b746775d Unbreak the build on ppc32
Signed-off-by: malc <av1474@comtv.ru>
2011-08-08 13:46:51 +04:00
Kevin Wolf 00dccaf1f8 coroutine: introduce coroutines
Asynchronous code is becoming very complex.  At the same time
synchronous code is growing because it is convenient to write.
Sometimes duplicate code paths are even added, one synchronous and the
other asynchronous.  This patch introduces coroutines which allow code
that looks synchronous but is asynchronous under the covers.

A coroutine has its own stack and is therefore able to preserve state
across blocking operations, which traditionally require callback
functions and manual marshalling of parameters.

Creating and starting a coroutine is easy:

  coroutine = qemu_coroutine_create(my_coroutine);
  qemu_coroutine_enter(coroutine, my_data);

The coroutine then executes until it returns or yields:

  void coroutine_fn my_coroutine(void *opaque) {
      MyData *my_data = opaque;

      /* do some work */

      qemu_coroutine_yield();

      /* do some more work */
  }

Yielding switches control back to the caller of qemu_coroutine_enter().
This is typically used to switch back to the main thread's event loop
after issuing an asynchronous I/O request.  The request callback will
then invoke qemu_coroutine_enter() once more to switch back to the
coroutine.

Note that if coroutines are used only from threads which hold the global
mutex they will never execute concurrently.  This makes programming with
coroutines easier than with threads.  Race conditions cannot occur since
only one coroutine may be active at any time.  Other coroutines can only
run across yield.

This coroutines implementation is based on the gtk-vnc implementation
written by Anthony Liguori <anthony@codemonkey.ws> but it has been
significantly rewritten by Kevin Wolf <kwolf@redhat.com> to use
setjmp()/longjmp() instead of the more expensive swapcontext() and by
Paolo Bonzini <pbonzini@redhat.com> for Windows Fibers support.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-08-01 12:14:09 +02:00