qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Paolo Bonzini	a0710f7995	iothread: release iothread around aio_poll This is the first step towards having fine-grained critical sections in dataplane threads, which resolves lock ordering problems between address_space_* functions (which need the BQL when doing MMIO, even after we complete RCU-based dispatch) and the AioContext. Because AioContext does not use contention callbacks anymore, the unit test has to be changed. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:08 +02:00
Paolo Bonzini	e98ab09709	aio-posix: move pollfds to thread-local storage By using thread-local storage, aio_poll can stop using global data during g_poll_ns. This will make it possible to drop callbacks from rfifolock. [Moved npfd = 0 assignment to end of walking_handlers region as suggested by Paolo. This resolves the assert(npfd == 0) assertion failure in pollfds_cleanup(). --Stefan] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-04-28 15:36:08 +02:00
Paolo Bonzini	e8d3b1a25f	aio: strengthen memory barriers for bottom half scheduling There are two problems with memory barriers in async.c. The fix is to use atomic_xchg in order to achieve sequential consistency between the scheduling of a bottom half and the corresponding execution. First, if bh->scheduled is already 1 in qemu_bh_schedule, QEMU does not execute a memory barrier to order any writes needed by the callback before the read of bh->scheduled. If the other side sees req->state as THREAD_ACTIVE, the callback is not invoked and you get deadlock. Second, the memory barrier in aio_bh_poll is too weak. Without this patch, it is possible that bh->scheduled = 0 is not "published" until after the callback has returned. Another thread wants to schedule the bottom half, but it sees bh->scheduled = 1 and does nothing. This causes a lost wakeup. The memory barrier should have been changed to smp_mb() in commit `924fe12` (aio: fix qemu_bh_schedule() bh->ctx race condition, 2014-06-03) together with qemu_bh_schedule()'s. Guess who reviewed that patch? Both of these involve a store and a load, so they are reproducible on x86_64 as well. It is however much easier on aarch64, where the libguestfs test suite triggers the bug fairly easily. Even there the failure can go away or appear depending on compiler optimization level, tracing options, or even kernel debugging options. Paul Leveille however reported how to trigger the problem within 15 minutes on x86_64 as well. His (untested) recipe, reproduced here for reference, is the following: 1) Qcow2 (or 3) is critical – raw files alone seem to avoid the problem. 2) Use “cache=directsync” rather than the default of “cache=none” to make it happen easier. 3) Use a server with a write-back RAID controller to allow for rapid IO rates. 4) Run a random-access load that (mostly) writes chunks to various files on the virtual block device. a. I use ‘diskload.exe c:25’, a Microsoft HCT load generator, on Windows VMs. b. Iometer can probably be configured to generate a similar load. 5) Run multiple VMs in parallel, against the same storage device, to shake the failure out sooner. 6) IvyBridge and Haswell processors for certain; not sure about others. A similar patch survived over 12 hours of testing, where an unpatched QEMU would fail within 15 minutes. This bug is, most likely, also the cause of failures in the libguestfs testsuite on AArch64. Thanks to Laszlo Ersek for initially reporting this bug, to Stefan Hajnoczi for suggesting closer examination of qemu_bh_schedule, and to Paul for providing test input and a prototype patch. Reported-by: Laszlo Ersek <lersek@redhat.com> Reported-by: Paul Leveille <Paul.Leveille@stratus.com> Reported-by: John Snow <jsnow@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-id: 1428419779-26062-1-git-send-email-pbonzini@redhat.com Suggested-by: Paul Leveille <Paul.Leveille@stratus.com> Suggested-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2015-04-09 10:29:29 +01:00
Paolo Bonzini	ee82310f8a	block: replace g_new0 with g_new for bottom half allocation. This saves about 15% of the clock cycles spent on allocation. Using the slice allocator does not add a visible improvement; allocation is faster than malloc, while freeing seems to be slower. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-01-13 11:47:56 +00:00
Paolo Bonzini	fcf5def1ab	block: mark AioContext as recursive AioContext can be accessed recursively, in fact that's what we do with aio_poll. Marking the GSource as recursive avoids that GLib blocks it and unblocks it around every call to aio_dispatch, which is a pretty expensive operation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2015-01-13 11:47:55 +00:00
Markus Armbruster	3ba235a022	block: Use g_new0() for a bit of extra type checking g_new(T, 1) is safer than g_malloc(sizeof(T)), because it returns T * rather than void *, which lets the compiler catch more type errors. Missed in commit `02c4f26`. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-id: 1417697709-13087-1-git-send-email-armbru@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-12-10 10:31:21 +01:00
Chrysostomos Nanakos	2f78e491d7	async: aio_context_new(): Handle event_notifier_init failure On a system with a low limit of open files the initialization of the event notifier could fail and QEMU exits without printing any error information to the user. The problem can be easily reproduced by enforcing a low limit of open files and start QEMU with enough I/O threads to hit this limit. The same problem raises, without the creation of I/O threads, while QEMU initializes the main event loop by enforcing an even lower limit of open files. This commit adds an error message on failure: # qemu [...] -object iothread,id=iothread0 -object iothread,id=iothread1 qemu: Failed to initialize event notifier: Too many open files in system Signed-off-by: Chrysostomos Nanakos <cnanakos@grnet.gr> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-09-22 11:39:48 +01:00
Paolo Bonzini	a3462c6561	AioContext: introduce aio_prepare This will be used to implement socket polling on Windows. On Windows, select() and g_poll() are completely different; sockets are polled with select() before calling g_poll, and the g_poll must be nonblocking if select() says a socket is ready. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-08-29 10:46:58 +01:00
Paolo Bonzini	e4c7e2d12d	AioContext: export and use aio_dispatch So far, aio_poll's scheme was dispatch/poll/dispatch, where the first dispatch phase was used only in the GSource case in order to avoid a blocking poll. Earlier patches changed it to dispatch/prepare/poll/dispatch, where prepare is aio_compute_timeout. By making aio_dispatch public, we can remove the first dispatch phase altogether, so that both aio_poll and the GSource use the same prepare/poll/dispatch scheme. This patch breaks the invariant that aio_poll(..., true) will not block the first time it returns false. This used to be fundamental for qemu_aio_flush's implementation as "while (qemu_aio_wait()) {}" but no code in QEMU relies on this invariant anymore. The return value of aio_poll() is now comparable with that of g_main_context_iteration. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-08-29 10:46:58 +01:00
Paolo Bonzini	845ca10dd0	AioContext: take bottom halves into account when computing aio_poll timeout Right now, QEMU invokes aio_bh_poll before the "poll" phase of aio_poll. It is simpler to do it afterwards and skip the "poll" phase altogether when the OS-dependent parts of AioContext are invoked from GSource. This way, AioContext behaves more similarly when used as a GSource vs. when used as stand-alone. As a start, take bottom halves into account when computing the poll timeout. If a bottom half is ready, do a non-blocking poll. As a side effect, this makes idle bottom halves work with aio_poll; an improvement, but not really an important one since they are deprecated. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-08-29 10:46:58 +01:00
Paolo Bonzini	0ceb849bd3	AioContext: speed up aio_notify In many cases, the call to event_notifier_set in aio_notify is unnecessary. In particular, if we are executing aio_dispatch, or if aio_poll is not blocking, we know that we will soon get to the next loop iteration (if necessary); the thread that hosts the AioContext's event loop does not need any nudging. The patch includes a Promela formal model that shows that this really works and does not need any further complication such as generation counts. It needs a memory barrier though. The generation counts are not needed because any change to ctx->dispatching after the memory barrier is okay for aio_notify. If it changes from zero to one, it is the right thing to skip event_notifier_set. If it changes from one to zero, the event_notifier_set is unnecessary but harmless. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2014-07-09 15:50:11 +02:00
Stefan Hajnoczi	924fe1293c	aio: fix qemu_bh_schedule() bh->ctx race condition qemu_bh_schedule() is supposed to be thread-safe at least the first time it is called. Unfortunately this is not quite true: bh->scheduled = 1; aio_notify(bh->ctx); Since another thread may run the BH callback once it has been scheduled, there is a race condition if the callback frees the BH before aio_notify(bh->ctx) has a chance to run. Reported-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Tested-by: Stefan Priebe <s.priebe@profihost.ag>	2014-06-04 09:56:06 +02:00
Stefan Hajnoczi	98563fc3ec	aio: add aio_context_acquire() and aio_context_release() It can be useful to run an AioContext from a thread which normally does not "own" the AioContext. For example, request draining can be implemented by acquiring the AioContext and looping aio_poll() until all requests have been completed. The following pattern should work: /* Event loop thread / while (running) { aio_context_acquire(ctx); aio_poll(ctx, true); aio_context_release(ctx); } / Another thread */ aio_context_acquire(ctx); bdrv_read(bs, 0x1000, buf, 1); aio_context_release(ctx); This patch implements aio_context_acquire() and aio_context_release(). Note that existing aio_poll() callers do not need to worry about acquiring and releasing - it is only needed when multiple threads will call aio_poll() on the same AioContext. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2014-03-13 14:42:24 +01:00
Alex Bligh	533a8cf350	aio / timers: aio_ctx_prepare sets timeout from AioContext timers Calculate the timeout in aio_ctx_prepare taking into account the timers attached to the AioContext. Alter aio_ctx_check similarly. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-08-22 19:10:28 +02:00
Alex Bligh	d5541d8680	aio / timers: Add a notify callback to QEMUTimerList Add a notify pointer to QEMUTimerList so it knows what to notify on a timer change. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-08-22 19:10:28 +02:00
Alex Bligh	dae21b98b9	aio / timers: Add QEMUTimerListGroup to AioContext Add a QEMUTimerListGroup each AioContext (meaning a QEMUTimerList associated with each clock is added) and delete it when the AioContext is freed. Signed-off-by: Alex Bligh <alex@alex.org.uk> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-08-22 19:10:27 +02:00
Stefan Hajnoczi	f2e5dca46b	aio: drop io_flush argument The .io_flush() handler no longer exists and has no users. Drop the io_flush argument to aio_set_fd_handler() and related functions. The AioFlushEventNotifierHandler and AioFlushHandler typedefs are no longer used and are dropped too. Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-08-19 15:52:19 +02:00
Liu Ping Fan	dcc772e2f2	QEMUBH: make AioContext's bh re-entrant BH will be used outside big lock, so introduce lock to protect between the writers, ie, bh's adders and deleter. The lock only affects the writers and bh's callback does not take this extra lock. Note that for the same AioContext, aio_bh_poll() can not run in parallel yet. Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-07-19 12:29:21 +08:00
Stefan Hajnoczi	9b34277d23	aio: add a ThreadPool instance to AioContext This patch adds a ThreadPool to AioContext. It's possible that some AioContext instances will never use the ThreadPool, so defer creation until aio_get_thread_pool(). The reason why AioContext should have the ThreadPool is because the ThreadPool is bound to a AioContext instance where the work item's callback function is invoked. It doesn't make sense to keep the ThreadPool pointer anywhere other than AioContext. For example, block/raw-posix.c can get its AioContext's ThreadPool and submit work. Special note about headers: I used struct ThreadPool in aio.h because there is a circular dependency if aio.h includes thread-pool.h. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>	2013-03-15 16:07:50 +01:00
Stefan Hajnoczi	6b5f876252	aio: convert aio_poll() to g_poll(3) AioHandler already has a GPollFD so we can directly use its events/revents. Add the int pollfds_idx field to AioContext so we can map g_poll(3) results back to AioHandlers. Reuse aio_dispatch() to invoke handlers after g_poll(3). Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Message-id: 1361356113-11049-10-git-send-email-stefanha@redhat.com Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-02-21 16:17:31 -06:00
Paolo Bonzini	1de7afc984	misc: move include files to include/qemu/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:32:39 +01:00
Paolo Bonzini	737e150e89	block: move include files to include/block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Kevin Wolf	c57b6656c3	aio: Get rid of qemu_aio_flush() There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-11 11:04:25 +01:00
Paolo Bonzini	f5022a135e	aio: fix aio_ctx_prepare with idle bottom halves Commit ed2aec4867f0d5f5de496bb765347b5d0cfe113d changed the return value of aio_ctx_prepare from false to true when only idle bottom halves are available. This broke PC old-style DMA, which uses them. Fix this by making aio_ctx_prepare return true only when non-idle bottom halves are scheduled to run. Reported-by: malc <av1474@comtv.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: malc <av1474@comtv.ru>	2012-11-12 20:02:09 +04:00
Paolo Bonzini	22bfa75eaf	aio: clean up now-unused functions Some cleanups can now be made, now that the main loop does not anymore need hooks into the bottom half code. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:54 +01:00
Paolo Bonzini	2f4dc3c1b2	aio: add aio_notify With this change async.c does not rely anymore on any service from main-loop.c, i.e. it is completely self-contained. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	e3713e001f	aio: make AioContexts GSources This lets AioContexts be used (optionally) with a glib main loop. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	7c0628b20e	aio: add non-blocking variant of aio_wait This will be used when polling the GSource attached to an AioContext. Reviewed-by: Anthony Liguori <aliguori@us.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	a915f4bc97	aio: add I/O handlers to the AioContext interface With this patch, I/O handlers (including event notifier handlers) can be attached to a single AioContext. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	f627aab1cc	aio: introduce AioContext, move bottom halves there Start introducing AioContext, which will let us remove globals from aio.c/async.c, and introduce multiple I/O threads. The bottom half functions now take an additional AioContext argument. A bottom half is created with a specific AioContext that remains the same throughout the lifetime. qemu_bh_new is just a wrapper that uses a global context. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Stefan Weil	9b47b17e80	async: Use bool for boolean struct members and remove a hole Using bool reduces the size of the structure and improves readability. A hole in the structure was removed. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2012-05-01 10:13:25 +01:00
Stefano Stabellini	7c7db75576	main_loop_wait: block indefinitely - remove qemu_calculate_timeout; - explicitly size timeout to uint32_t; - introduce slirp_update_timeout; - pass NULL as timeout argument to select in case timeout is the maximum value; Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Acked-by: Paul Brook <paul@codesourcery.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2012-04-26 13:14:58 -05:00
Paolo Bonzini	44a9b356ad	main-loop: create main-loop.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2011-10-21 18:14:30 +02:00
Kevin Wolf	648fb0ea5e	async: Allow nested qemu_bh_poll calls qemu may segfault when a BH handler first deletes a BH and then (possibly indirectly) calls a nested qemu_bh_poll(). This is because the inner instance frees the BH and deletes it from the list that the outer one processes. This patch deletes BHs only in the outermost qemu_bh_poll instance. Commit `7887f620` already tried to achieve the same, but it assumed that the BH handler would only delete its own BH. With a nested qemu_bh_poll(), this isn't guaranteed, so that commit wasn't enough. Hope this one fixes it for real. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-09-06 11:23:51 +02:00
Anthony Liguori	7267c0947d	Use glib memory allocation and free functions qemu_malloc/qemu_free no longer exist after this commit. Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2011-08-20 23:01:08 -05:00
Kevin Wolf	384acbf46b	async: Remove AsyncContext The purpose of AsyncContexts was to protect qcow and qcow2 against reentrancy during an emulated bdrv_read/write (which includes a qemu_aio_wait() call and can run AIO callbacks of different requests if it weren't for AsyncContexts). Now both qcow and qcow2 are protected by CoMutexes and AsyncContexts can be removed. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2011-08-02 15:53:41 +02:00
Kevin Wolf	7887f6201f	Allow nested qemu_bh_poll() after BH deletion Without this, qemu segfaults when a BH handler first deletes its BH and then calls another function which involves a nested qemu_bh_poll() call. This can be reproduced by generating an I/O error (e.g. with blkdebug) on an IDE device and using rerror/werror=stop to stop the VM. When continuing the VM, qemu segfaults. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>	2011-06-15 15:43:20 +02:00
Kevin Wolf	9a1e948129	Introduce contexts for asynchronous callbacks Add the possibility to use AIO and BHs without allowing foreign callbacks to be run. Basically, you put your own AIOs and BHs in a separate context. For details see the comments in the source. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-27 12:28:59 -05:00
Kevin Wolf	4f999d05f5	Split out bottom halves Instead of putting more and more stuff into vl.c, let's have the generic functions that deal with asynchronous callbacks in their own file. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2009-10-27 12:28:59 -05:00

39 Commits