Commit Graph

1110 Commits

Author SHA1 Message Date
Stefan Hajnoczi ae60ab7eb2 aio-posix: fix test-aio /aio/event/wait with fdmon-io_uring
When a file descriptor becomes ready we must re-arm POLL_ADD.  This is
done by adding an sqe to the io_uring sq ring.  The ->need_wait()
function wasn't taking pending sqes into account and therefore
io_uring_submit_and_wait() was not being called.  Polling for cqes
failed to detect fd readiness since we hadn't submitted the sqe to
io_uring.

This patch fixes the following tests/test-aio -p /aio/event/wait
failure:

  ok 11 /aio/event/wait
  **
  ERROR:tests/test-aio.c:374:test_flush_event_notifier: assertion failed: (aio_poll(ctx, false))

Reported-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200402145434.99349-1-stefanha@redhat.com
Fixes: 73fd282e7b
       ("aio-posix: add io_uring fd monitoring implementation")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-04-03 12:42:40 +01:00
Robert Hoo 8f13a39dc0 util/bufferiszero: improve avx2 accelerator
By increasing avx2 length_to_accel to 128, we can simplify its logic and reduce a
branch.

The authorship of this patch actually belongs to Richard Henderson
<richard.henderson@linaro.org>, I just fixed a boundary case on his
original patch.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1585119021-46593-2-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-01 14:24:03 -04:00
Robert Hoo b87c99d073 util/bufferiszero: assign length_to_accel value for each accelerator case
Because in unit test, init_accel() will be called several times, each with
different accelerator type.

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1585119021-46593-1-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-01 14:24:03 -04:00
Stefan Hajnoczi ff807d5592 aio-posix: fix io_uring with external events
When external event sources are disabled fdmon-io_uring falls back to
fdmon-poll.  The ->need_wait() callback needs to watch for this so it
can return true when external event sources are disabled.

It is also necessary to call ->wait() when AioHandlers have changed
because io_uring is asynchronous and we must submit new sqes.

Both of these changes to ->need_wait() together fix tests/test-aio -p
/aio/external-client, which failed with:

  test-aio: tests/test-aio.c:404: test_aio_external_client: Assertion `aio_poll(ctx, false)' failed.

Reported-by: Julia Suvorova <jusual@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20200319163559.117903-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-03-23 11:05:44 +00:00
Vladimir Sementsov-Ogievskiy 299ea9ff01 block/dirty-bitmap: improve _next_dirty_area API
Firstly, _next_dirty_area is for scenarios when we may contiguously
search for next dirty area inside some limited region, so it is more
comfortable to specify "end" which should not be recalculated on each
iteration.

Secondly, let's add a possibility to limit resulting area size, not
limiting searching area. This will be used in NBD code in further
commit. (Note that now bdrv_dirty_bitmap_next_dirty_area is unused)

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-8-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy 9399c54b75 block/dirty-bitmap: add _next_dirty API
We have bdrv_dirty_bitmap_next_zero, let's add corresponding
bdrv_dirty_bitmap_next_dirty, which is more comfortable to use than
bitmap iterators in some cases.

For test modify test_hbitmap_next_zero_check_range to check both
next_zero and next_dirty and add some new checks.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-7-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy 642700fda0 block/dirty-bitmap: switch _next_dirty_area and _next_zero to int64_t
We are going to introduce bdrv_dirty_bitmap_next_dirty so that same
variable may be used to store its return value and to be its parameter,
so it would int64_t.

Similarly, we are going to refactor hbitmap_next_dirty_area to use
hbitmap_next_dirty together with hbitmap_next_zero, therefore we want
hbitmap_next_zero parameter type to be int64_t too.

So, for convenience update all parameters of *_next_zero and
*_next_dirty_area to be int64_t.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-6-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy 0c88f1970c hbitmap: drop meta bitmaps as they are unused
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-5-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy 30b8346cc3 hbitmap: unpublish hbitmap_iter_skip_words
Function is internal and even commented as internal. Drop its
definition from .h file.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-4-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy be24c7140c hbitmap: move hbitmap_iter_next_word to hbitmap.c
The function is definitely internal (it's not used by third party and
it has complicated interface). Move it to .c file.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-3-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Vladimir Sementsov-Ogievskiy 6a150995d4 hbitmap: assert that we don't create bitmap larger than INT64_MAX
We have APIs which returns signed int64_t, to be able to return error.
Therefore we can't handle bitmaps with absolute size larger than
(INT64_MAX+1). Still, keep maximum to be INT64_MAX which is a bit
safer.

Note, that bitmaps are used to represent disk images, which can't
exceed INT64_MAX anyway.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20200205112041.6003-2-vsementsov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
2020-03-18 14:03:46 -04:00
Stefan Hajnoczi 3284c3ddc4 lockable: add lock guards
This patch introduces two lock guard macros that automatically unlock a
lock object (QemuMutex and others):

  void f(void) {
      QEMU_LOCK_GUARD(&mutex);
      if (!may_fail()) {
          return; /* automatically unlocks mutex */
      }
      ...
  }

and:

  WITH_QEMU_LOCK_GUARD(&mutex) {
      if (!may_fail()) {
          return; /* automatically unlocks mutex */
      }
  }
  /* automatically unlocks mutex here */
  ...

Convert qemu-timer.c functions that benefit from these macros as an
example.  Manual qemu_mutex_lock/unlock() callers are left unmodified in
cases where clarity would not improve by switching to the macros.

Many other QemuMutex users remain in the codebase that might benefit
from lock guards.  Over time they can be converted, if that is
desirable.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[Use QEMU_MAKE_LOCKABLE_NONNULL. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-17 15:18:45 +01:00
Christian Ehrhardt bd83c861c0 modules: load modules from versioned /var/run dir
On upgrades the old .so files usually are replaced. But on the other
hand since a qemu process represents a guest instance it is usually kept
around.

That makes late addition of dynamic features e.g. 'hot-attach of a ceph
disk' fail by trying to load a new version of e.f. block-rbd.so into an
old still running qemu binary.

This adds a fallback to also load modules from a versioned directory in the
temporary /var/run path. That way qemu is providing a way for packaging
to store modules of an upgraded qemu package as needed until the next reboot.

An example how that can then be used in packaging can be seen in:
https://git.launchpad.net/~paelzer/ubuntu/+source/qemu/log/?h=bug-1847361-miss-old-so-on-upgrade-UBUNTU

Fixes: https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1847361
Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200310145806.18335-2-christian.ehrhardt@canonical.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 23:02:22 +01:00
Paolo Bonzini 78b3f67acd oslib-posix: initialize mutex and condition variable
The mutex and condition variable were never initialized, causing
-mem-prealloc to abort with an assertion failure.

Fixes: 037fb5eb39
Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Cc: bauerchen <bauerchen@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 23:02:22 +01:00
Robert Hoo 27f08ea1c7 util: add util function buffer_zero_avx512()
And intialize buffer_is_zero() with it, when Intel AVX512F is
available on host.

This function utilizes Intel AVX512 fundamental instructions which
is faster than its implementation with AVX2 (in my unit test, with
4K buffer, on CascadeLake SP, ~36% faster, buffer_zero_avx512() V.S.
buffer_zero_avx2()).

Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 23:02:21 +01:00
Peter Maydell 6e8a73e911 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAl5o3EQACgkQnKSrs4Gr
 c8jbYwgAupuS62MCITszybRE5Ote5CL80QDbMXNsZnZh6YBIGhPCqW2zdI2+m0zu
 +qoN6x6dxxNUWCvNlhbOcA45fiZVmYzS69TiEo21kCDijoK+h8+W+YbXuxGR2xJi
 cZMm8Q1DiK6Lj3vyfiwkFf4ns3VNz9DhI9hXu6CcpSkNcp79elQu87JJbzEWWWWy
 uEc7uEyBr0uCAKLEJvaLzzzWE2D2i6qKlmj3G17UbDNgCJ/Q/5HX13RUfMrgNgiP
 wmpcJ5MsB3Prz3K4XMMytUKXX/M8zpRLahp3p31t9qHelTWC3Lk1U4xzLPTJZPlm
 if/lrGRCRml+DKb9keBjWeTF4U31vg==
 =LftU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Wed 11 Mar 2020 12:40:36 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  aio-posix: remove idle poll handlers to improve scalability
  aio-posix: support userspace polling of fd monitoring
  aio-posix: add io_uring fd monitoring implementation
  aio-posix: simplify FDMonOps->update() prototype
  aio-posix: extract ppoll(2) and epoll(7) fd monitoring
  aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
  aio-posix: completely stop polling when disabled
  aio-posix: remove confusing QLIST_SAFE_REMOVE()
  qemu/queue.h: clear linked list pointers on remove

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-03-11 14:41:27 +00:00
Stefan Hajnoczi d37d0e365a aio-posix: remove idle poll handlers to improve scalability
When there are many poll handlers it's likely that some of them are idle
most of the time.  Remove handlers that haven't had activity recently so
that the polling loop scales better for guests with a large number of
devices.

This feature only takes effect for the Linux io_uring fd monitoring
implementation because it is capable of combining fd monitoring with
userspace polling.  The other implementations can't do that and risk
starving fds in favor of poll handlers, so don't try this optimization
when they are in use.

IOPS improves from 10k to 105k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device for rw=randread,iodepth=1,bs=4k,ioengine=libaio on NVMe.

[Clarified aio_poll_handlers locking discipline explanation in comment
after discussion with Paolo Bonzini <pbonzini@redhat.com>.
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-8-stefanha@redhat.com
Message-Id: <20200305170806.1313245-8-stefanha@redhat.com>
2020-03-09 16:45:16 +00:00
Stefan Hajnoczi aa38e19f05 aio-posix: support userspace polling of fd monitoring
Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace.  Previously userspace polling was only allowed when all
AioHandler's had an ->io_poll() callback.  This prevented starvation of
fds by userspace pollable handlers.

Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().

For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-7-stefanha@redhat.com
Message-Id: <20200305170806.1313245-7-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi 73fd282e7b aio-posix: add io_uring fd monitoring implementation
The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(2).  Details are given in the source code.

Add an io_uring implementation and make it the default on Linux.
Performance is the same as with epoll(7) but later patches add
optimizations that take advantage of io_uring.

It is necessary to change how aio_set_fd_handler() deals with deleting
AioHandlers since removing monitored file descriptors is asynchronous in
io_uring.  fdmon_io_uring_remove() marks the AioHandler deleted and
aio_set_fd_handler() will let it handle deletion in that case.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-6-stefanha@redhat.com
Message-Id: <20200305170806.1313245-6-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi b321051cf4 aio-posix: simplify FDMonOps->update() prototype
The AioHandler *node, bool is_new arguments are more complicated to
think about than simply being given AioHandler *old_node, AioHandler
*new_node.

Furthermore, the new Linux io_uring file descriptor monitoring mechanism
added by the new patch requires access to both the old and the new
nodes.  Make this change now in preparation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-5-stefanha@redhat.com
Message-Id: <20200305170806.1313245-5-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi 1f050a4690 aio-posix: extract ppoll(2) and epoll(7) fd monitoring
The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code.  Before adding another
implementation for Linux io_uring, extract out the existing
ones so there is a clear interface and the core code is simpler.

The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps
struct.  See the patch for details.

Semantic changes:
1. ppoll(2) now reflects events from pollfds[] back into AioHandlers
   while we're still on the clock for adaptive polling.  This was
   already happening for epoll(7), so if it's really an issue then we'll
   need to fix both in the future.
2. epoll(7)'s fallback to ppoll(2) while external events are disabled
   was broken when the number of fds exceeded the epoll(7) upgrade
   threshold.  I guess this code path simply wasn't tested and no one
   noticed the bug.  I didn't go out of my way to fix it but the correct
   code is simpler than preserving the bug.

I also took some liberties in removing the unnecessary
AioContext->epoll_available (just check AioContext->epollfd != -1
instead) and AioContext->epoll_enabled (it's implicit if our
AioContext->fdmon_ops callbacks are being invoked) fields.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-4-stefanha@redhat.com
Message-Id: <20200305170806.1313245-4-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi 3aa221b382 aio-posix: move RCU_READ_LOCK() into run_poll_handlers()
Now that run_poll_handlers_once() is only called by run_poll_handlers()
we can improve the CPU time profile by moving the expensive
RCU_READ_LOCK() out of the polling loop.

This reduces the run_poll_handlers() from 40% CPU to 10% CPU in perf's
sampling profiler output.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-3-stefanha@redhat.com
Message-Id: <20200305170806.1313245-3-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi e4346192f1 aio-posix: completely stop polling when disabled
One iteration of polling is always performed even when polling is
disabled.  This is done because:
1. Userspace polling is cheaper than making a syscall.  We might get
   lucky.
2. We must poll once more after polling has stopped in case an event
   occurred while stopping polling.

However, there are downsides:
1. Polling becomes a bottleneck when the number of event sources is very
   high.  It's more efficient to monitor fds in that case.
2. A high-frequency polling event source can starve non-polling event
   sources because ppoll(2)/epoll(7) is never invoked.

This patch removes the forced polling iteration so that poll_ns=0 really
means no polling.

IOPS increases from 10k to 60k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device because the large number of event sources being polled slows down
the event loop.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-2-stefanha@redhat.com
Message-Id: <20200305170806.1313245-2-stefanha@redhat.com>
2020-03-09 16:41:31 +00:00
Stefan Hajnoczi c39cbedb54 aio-posix: remove confusing QLIST_SAFE_REMOVE()
QLIST_SAFE_REMOVE() is confusing here because the node must be on the
list.  We actually just wanted to clear the linked list pointers when
removing it from the list.  QLIST_REMOVE() now does this, so switch to
it.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200224103406.1894923-3-stefanha@redhat.com
Message-Id: <20200224103406.1894923-3-stefanha@redhat.com>
2020-03-09 16:39:20 +00:00
Philippe Mathieu-Daudé cf0c76cd6d util/osdep: Improve error report by calling error_setg_win32()
Use error_setg_win32() which adds a hint similar to strerror(errno)).

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200228100726.8414-3-philmd@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-03-09 13:36:15 +01:00
bauerchen 037fb5eb39 mem-prealloc: optimize large guest startup
[desc]:
    Large memory VM starts slowly when using -mem-prealloc, and
    there are some areas to optimize in current method;

    1、mmap will be used to alloc threads stack during create page
    clearing threads, and it will attempt mm->mmap_sem for write
    lock, but clearing threads have hold read lock, this competition
    will cause threads createion very slow;

    2、methods of calcuating pages for per threads is not well;if we use
    64 threads to split 160 hugepage,63 threads clear 2page,1 thread
    clear 34 page,so the entire speed is very slow;

    to solve the first problem,we add a mutex in thread function,and
    start all threads when all threads finished createion;
    and the second problem, we spread remainder to other threads,in
    situation that 160 hugepage and 64 threads, there are 32 threads
    clear 3 pages,and 32 threads clear 2 pages.

[test]:
    320G 84c VM start time can be reduced to 10s
    680G 84c VM start time can be reduced to 18s

Signed-off-by: bauerchen <bauerchen@tencent.com>
Reviewed-by: Pan Rui <ruippan@tencent.com>
Reviewed-by: Ivan Ren <ivanren@tencent.com>
[Simplify computation of the number of pages per thread. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-25 09:18:01 +01:00
Alexander Bulekov 46a07579eb module: check module wasn't already initialized
The virtual-device fuzzer must initialize QOM, prior to running
vl:qemu_init, so that it can use the qos_graph to identify the arguments
required to initialize a guest for libqos-assisted fuzzing. This change
prevents errors when vl:qemu_init tries to (re)initialize the previously
initialized QOM module.

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200220041118.23264-4-alxndr@bu.edu
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi 7391d34c3c aio-posix: make AioHandler dispatch O(1) with epoll
File descriptor monitoring is O(1) with epoll(7), but
aio_dispatch_handlers() still scans all AioHandlers instead of
dispatching just those that are ready.  This makes aio_poll() O(n) with
respect to the total number of registered handlers.

Add a local ready_list to aio_poll() so that each nested aio_poll()
builds a list of handlers ready to be dispatched.  Since file descriptor
polling is level-triggered, nested aio_poll() calls also see fds that
were ready in the parent but not yet dispatched.  This guarantees that
nested aio_poll() invocations will dispatch all fds, even those that
became ready before the nested invocation.

Since only handlers ready to be dispatched are placed onto the
ready_list, the new aio_dispatch_ready_handlers() function provides O(1)
dispatch.

Note that AioContext polling is still O(n) and currently cannot be fully
disabled.  This still needs to be fixed before aio_poll() is fully O(1).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-6-stefanha@redhat.com
[Fix compilation error on macOS where there is no epoll(87).  The
aio_epoll() prototype was out of date and aio_add_ready_list() needed to
be moved outside the ifdef.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi 4749079ce0 aio-posix: make AioHandler deletion O(1)
It is not necessary to scan all AioHandlers for deletion.  Keep a list
of deleted handlers instead of scanning the full list of all handlers.

The AioHandler->deleted field can be dropped.  Let's check if the
handler has been inserted into the deleted list instead.  Add a new
QLIST_IS_INSERTED() API for this check.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi ca8c6b2275 aio-posix: don't pass ns timeout to epoll_wait()
Don't pass the nanosecond timeout into epoll_wait(), which expects
milliseconds.

The epoll_wait() timeout value does not matter if qemu_poll_ns()
determined that the poll fd is ready, but passing a value in the wrong
units is still ugly.  Pass a 0 timeout to epoll_wait() instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi ff29ed3a33 aio-posix: fix use after leaving scope in aio_poll()
epoll_handler is a stack variable and must not be accessed after it goes
out of scope:

      if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
          AioHandler epoll_handler;
          ...
          add_pollfd(&epoll_handler);
          ret = aio_epoll(ctx, pollfds, npfd, timeout);
      } ...

  ...

  /* if we have any readable fds, dispatch event */
  if (ret > 0) {
      for (i = 0; i < npfd; i++) {
          nodes[i]->pfd.revents = pollfds[i].revents;
      }
  }

nodes[0] is &epoll_handler, which has already gone out of scope.

There is no need to use pollfds[] for epoll.  We don't need an
AioHandler for the epoll fd.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Message-id: 20200214171712.541358-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi 8c6b0356b5 util/async: make bh_aio_poll() O(1)
The ctx->first_bh list contains all created BHs, including those that
are not scheduled.  The list is iterated by the event loop and therefore
has O(n) time complexity with respected to the number of created BHs.

Rewrite BHs so that only scheduled or deleted BHs are enqueued.
Only BHs that actually require action will be iterated.

One semantic change is required: qemu_bh_delete() enqueues the BH and
therefore invokes aio_notify().  The
tests/test-aio.c:test_source_bh_delete_from_cb() test case assumed that
g_main_context_iteration(NULL, false) returns false after
qemu_bh_delete() but it now returns true for one iteration.  Fix up the
test case.

This patch makes aio_compute_timeout() and aio_bh_poll() drop from a CPU
profile reported by perf-top(1).  Previously they combined to 9% CPU
utilization when AioContext polling is commented out and the guest has 2
virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32 devices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200221093951.1414693-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Stefan Hajnoczi f25c0b5479 aio-posix: avoid reacquiring rcu_read_lock() when polling
The first rcu_read_lock/unlock() is expensive.  Nested calls are cheap.

This optimization increases IOPS from 73k to 162k with a Linux guest
that has 2 virtio-blk,num-queues=1 and 99 virtio-blk,num-queues=32
devices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20200218182708.914552-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-02-22 08:26:47 +00:00
Peter Maydell a8c6af67e1 ppc patch queue 2020-02-21
Here's the next patch of ppc target patches.  Highlights are:
   * Some fixes for CAS / unplug interactions
   * Remove some leaks of device trees
   * Some fixes for the PHB3 and PHB4 devices
   * Support for NVDIMMs on the pseries machine type
   * Assorted other fixes and cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl5PUAwACgkQbDjKyiDZ
 s5KBiBAAzUs/K1ygQuvcNTChOeu+3rBMe5XWRSRPIhqIZWnebssptwiMfqjVQXHB
 1unLmmkUU8ov9cXWiRiAF/mrmBX6Kw2UbnNzjatvB7L1BX8b3Z848vYKb0mTZG6g
 Q7E8nyqQqmWUtzeF6B0sNf597oPalHO5j6r5e/Pl5hjZGkjDXihB5tKePBRucnOO
 wCsWSsoafaBJwq2w8KBahfgLYdQdCW2G2GHcfGouobzravEuOaxpknoLJiUw65Wa
 Gf6PFrJE5XxDbn+zxkvs6RvPnNMkfFxn0fYbmPghjAM58Unz2LGyf9RRd5b6beuu
 fYsoZPxX+gOFVjBBwjTmj+VxSL1Fw+b2xOrd1uf7zVNyJNO8AsxRWQMD7bLFTGSZ
 95KkP5wSjcw2sm8eTtKkUAOP9YcTBJP1mjfjX/9rSGASCoOv6DJ71YOtyI4pzh5S
 DYLYm2EL3A0BxqxLodSx4xcTrh3IHAUXcteYr+ndGOy7MHzZGR88gtUmuAoFcgKc
 N0xV8dRoC5gNgLLxQ7zMh9q6qQHr0n13rRFKQ075piU4ce7JUzSR+EXCfQh2HLv+
 GYM+FpZo2zFsWZxXDQE1fsLaCfz533HsplevAM2J/bGWAokYMEZKV6jV7DKXgxW0
 6SrrSdxJrEThb198j1jSKCs0vwLDPcvV1GV6/sWStHBH1totRhY=
 =hrAW
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-5.0-20200221' into staging

ppc patch queue 2020-02-21

Here's the next patch of ppc target patches.  Highlights are:
  * Some fixes for CAS / unplug interactions
  * Remove some leaks of device trees
  * Some fixes for the PHB3 and PHB4 devices
  * Support for NVDIMMs on the pseries machine type
  * Assorted other fixes and cleanups

# gpg: Signature made Fri 21 Feb 2020 03:35:40 GMT
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-5.0-20200221:
  hw/ppc/virtex_ml507:fix leak of fdevice tree blob
  spapr: Fix handling of unplugged devices during CAS and migration
  spapr: Don't use spapr_drc_needed() in CAS code
  ppc: free 'fdt' after reset the machine
  target/ppc/cpu.h: Clean up comments in the struct CPUPPCState definition
  target/ppc/cpu.h: Move fpu related members closer in cpu env
  target/ppc: Fix typo in comments
  spapr: Allow changing offset for -kernel image
  pnv/phb3: Add missing break statement
  pnv/phb4: Fix error path in pnv_pec_realize()
  pnv/phb3: Convert 1u to 1ull
  target/ppc/cpu.h: Remove duplicate includes
  spapr: Add Hcalls to support PAPR NVDIMM device
  spapr: Add NVDIMM device support
  nvdimm: add uuid property to nvdimm
  mem: move nvdimm_device_list to utilities
  ppc: function to setup latest class options
  ppc/pnv: Fix PCI_EXPRESS dependency
  qtest: Fix rtas dependencies
  spapr/rtas: Print message from "ibm,os-term"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-21 14:20:42 +00:00
Shivaprasad G Bhat 3f350f6bb3 mem: move nvdimm_device_list to utilities
nvdimm_device_list is required for parsing the list for devices
in subsequent patches. Move it to common utility area.

Signed-off-by: Shivaprasad G Bhat <sbhat@linux.ibm.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <158131055857.2897.15658377276504711773.stgit@lep8c.aus.stglabs.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-02-21 09:15:03 +11:00
Peter Maydell b651b80822 Implement membarrier, SO_RCVTIMEO and SO_SNDTIMEO
Disable by default build of fdt, slirp and tools with linux-user
 Improve strace and use qemu_log to send trace to a file
 Add partial ALSA ioctl supports
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAl5OT1QSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748i9kQAJYbShtYQNoNhSf/joKtQbxJ6vtQPryU
 8J85wD2RShYpX2qi86WZ4IGFhSE+wFQ+Lfi1Z0SPs/VUgTSaiOYYEZBfROiKzY/M
 hbpXlfzWGOLxfzgP17QR5eoSxbVA81N7ruyTiUxFGUTeVbowIXR+/v5Ek8GOJCuu
 pAA/07NHxuP/YP2MkrlfAolFte5riQgfQT+QrJwpVwygdRwMx0Ed+gDWJosl8zL/
 qEOzyZlXI/Mm+5bKCUQknSqpL3yQibmYfqM0XLhxjfvZ4vJDYk3Vet7ww39Je9Rz
 7eAmXU7ERzzuFBTA7HEJjAjy+BrPkPMJINczfYcahUrt3TYS36DxinTre0TJbkMV
 1np1Snn4LhSbvcAY/gFpO6X+lS5siYKZC9Ki2chBfl7AygjfswSl1wqE4cNwEyma
 ycjdYnlzo8JvuPPI7qploFAPr0hDgloAVfIzTlz+LC7dhsAhHtpf0vq0xC717z98
 qOTO1MVoZa0WserrnyAFDaNkBwwYGwhm9WKcd23/9XO6gjc5yZWF98oc31Z1KpT4
 ya+qqpcy++FNM3K+74BC403pdoda75N7sl0HYsxJWnhRjUOAPjEwHRo69iGzO/ku
 8AOcjws/wf2BnoR7KO/KLQrkpc+EAGY/ulndTqUEyC1MdsMoF6VOB5aTFiBgJBMO
 GlsvTkt+05i4
 =WAH+
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.0-pull-request' into staging

Implement membarrier, SO_RCVTIMEO and SO_SNDTIMEO
Disable by default build of fdt, slirp and tools with linux-user
Improve strace and use qemu_log to send trace to a file
Add partial ALSA ioctl supports

# gpg: Signature made Thu 20 Feb 2020 09:20:20 GMT
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-5.0-pull-request:
  linux-user: Add support for selected alsa timer instructions using ioctls
  linux-user: Add support for getting/setting selected alsa timer parameters using ioctls
  linux-user: Add support for selecting alsa timer using ioctl
  linux-user: Add support for getting/setting specified alsa timer parameters using ioctls
  linux-user: Add support for getting alsa timer version and id
  linux-user: remove gemu_log from the linux-user tree
  linux-user: Use `qemu_log' for strace
  linux-user: Use `qemu_log' for non-strace logging
  configure: Avoid compiling system tools on user build by default
  linux-user/strace: Improve output of various syscalls
  configure: linux-user doesn't need neither fdt nor slirp
  linux-user: implement getsockopt SO_RCVTIMEO and SO_SNDTIMEO
  linux-user: Implement membarrier syscall

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-02-20 17:35:42 +00:00
Josh Kunz 4b25a50674 linux-user: Use `qemu_log' for strace
This change switches linux-user strace logging to use the newer `qemu_log`
logging subsystem rather than the older `gemu_log` (notice the "g")
logger. `qemu_log` has several advantages, namely that it allows logging
to a file, and provides a more unified interface for configuration
of logging (via the QEMU_LOG environment variable or options).

This change introduces a new log mask: `LOG_STRACE` which is used for
logging of user-mode strace messages.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Josh Kunz <jkz@google.com>
Message-Id: <20200204025416.111409-3-jkz@google.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-02-19 11:17:40 +01:00
Michal Privoznik b09d51c909 Report stringified errno in VFIO related errors
In a few places we report errno formatted as a negative integer.
This is not as user friendly as it can be. Use strerror() and/or
error_setg_errno() instead.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <4949c3ecf1a32189b8a4b5eb4b0fd04c1122501d.1581674006.git.mprivozn@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-02-18 20:20:49 +01:00
Peter Maydell a284f798f3 Remove support for CLOCK_MONOTONIC not being defined
Some older parts of QEMU's codebase assume that CLOCK_MONOTONIC
might not be defined by the host OS, and have workarounds to
deal with this. However, more recently (notably in commit
50290c002c for qemu-img in mid-2019, but also much
earlier in 2011 in commit 22795174a3 for ui/spice-display.c)
we've written code that assumes CLOCK_MONOTONIC is always
defined. The only host OS anybody's ever noticed this on
is OSX 10.11 and earlier, which we don't support.

So we can assume that all our host OSes have the #define,
and we can remove some now-unnecessary ifdefs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200201172252.6605-1-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-12 16:23:01 +01:00
Peter Maydell 28db64fce5 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAl4zTL4ACgkQnKSrs4Gr
 c8j38wf9EMliB2OwzQGqmKLa+yE5lTzZLe4ou7yOR5jAwt6VfpT10PxtE2WSF0no
 KSiT8fhlY+AKx0Y0v57Qis0B9iLolPLOW7iTu07dBn5eQ7+mQiKeZi5OA+FtSq6i
 cnOug0/PRRKNtCz+U0AwEOMjDmBUkWsGuQRUe4mO+Trkb/GKdBoF7NsSqNoHJpJZ
 sv2se9WImJMV1OLmtwQ94l5hzsEhz5wWZ3n4DPa7J1qYBWjzaPydjNkIyO8kgshb
 i2FcNsL9As3jhIzuCUX8zRKzTaTJVLF8e07iInWNswKBUbt7LxbqtOZWMoTavgpu
 gsbBG3Rk6l08IM0cL5/r1PDYgroIuQ==
 =/0ID
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

Pull request

# gpg: Signature made Thu 30 Jan 2020 21:38:06 GMT
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  qemu_set_log_filename: filename argument may be NULL
  hw/display/qxl.c: Use trace_event_get_state_backends()
  memory.c: Use trace_event_get_state_backends()
  docs/devel/tracing.txt: Recommend only trace_event_get_state_backends()
  Makefile: Keep trace-events-subdirs ordered

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-31 17:37:00 +00:00
Salvador Fandino e144a605a6 qemu_set_log_filename: filename argument may be NULL
NULL is a valid log filename used to indicate we want to use stderr
but qemu_set_log_filename (which is called by bsd-user/main.c) was not
handling it correctly.

That also made redundant a couple of NULL checks in calling code which
have been removed.

Signed-off-by: Salvador Fandino <salvador@qindel.com>
Message-Id: <20200123193626.19956-1-salvador@qindel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-01-30 21:33:50 +00:00
Aarushi Mehta fcb7a4a4e8 util/async: add aio interfaces for io_uring
Signed-off-by: Aarushi Mehta <mehta.aaru20@gmail.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200120141858.587874-7-stefanha@redhat.com
Message-Id: <20200120141858.587874-7-stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2020-01-30 20:59:41 +00:00
Carlos Santos 00b5032ead util/cacheinfo: fix crash when compiling with uClibc
uClibc defines _SC_LEVEL1_ICACHE_LINESIZE and _SC_LEVEL1_DCACHE_LINESIZE
but the corresponding sysconf calls returns -1, which is a valid result,
meaning that the limit is indeterminate.

Handle this situation using the fallback values instead of crashing due
to an assertion failure.

Signed-off-by: Carlos Santos <casantos@redhat.com>
Message-Id: <20191017123713.30192-1-casantos@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2020-01-21 14:18:12 -10:00
Peter Maydell abd5f8bb95 Fix some uninitialized variable warnings,
some memory leak warnings and update MAINTAINERS file.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAl4V/QMSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748jsIP/iYNeDRLcur9XqW2j8MRGS6bhPFQv6jU
 INqEqxkVpdBykcE8IeqNpY8R8tFSvPdzUcYd7n5o752XQNrXXOzsBRT6J2VVLVPG
 /t6sx+yLD78WV/n4w5KWeXHwLv4ZdR7TuZP9slGygc6XiH1PerjCfh/ol3NkqFkI
 wk9zoCL95fMB3KEkmdUrQM1aX/ANlAQNBkgDKFLU+BPZJ7LaAh/4O96ONdNOtz1d
 nJ0TDQ0IaqtrclP1bODh9J72PT2dUFOZca7JfaDiIB0rvRpDyhPV6bjxkPi5Bfoc
 pVeF7Kc2H8wRqhgMDgqQioweQJWzrNWqpyYjQXznKldg2zdq3/ZqPVhPFi1hz81x
 XngPu8LoBCSZtgEdjiHoHFB56SWJ5MBSsQmBMIH/B3iaBaxVM8qFmv0VvpNU1fWx
 /x/5/DmLyQy334Aka5ZRqcat2tD8tAWzvppKzjIT+9QNXpr+n9Kdgkbkv39eJGoe
 2eIxhuhXZDtcmC8lMd1jUoFnaJ5pv34mHgTq/RfHYmXftwPx5wERT8GFUU/bQPCR
 mRtbXEqVl6QPSXdlFKFAV5eIBcPHU7n618+rmXTQqOvTIxV8CzhTV3jByUSX66JW
 PF9U+p4IYG/SrWdYFJH16pcyIPqU87cO8qNd7+HWHiVJvNHc4Yr5nR9I7rL0nZ4Q
 SKkUYkqlfcs3
 =An/0
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-pull-request' into staging

Fix some uninitialized variable warnings,
some memory leak warnings and update MAINTAINERS file.

# gpg: Signature made Wed 08 Jan 2020 16:02:11 GMT
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-pull-request:
  vl: fix memory leak in configure_accelerators
  arm/translate-a64: fix uninitialized variable warning
  nbd: fix uninitialized variable warning
  util/module: fix a memory leak
  MAINTAINERS: Update Yuval Shaia's email address

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-13 09:50:48 +00:00
Peter Maydell b952544fe8 * Compat machines fix (Denis)
* Command line parsing fixes (Michal, Peter, Xiaoyao)
 * Cooperlake CPU model fixes (Xiaoyao)
 * i386 gdb fix (mkdolata)
 * IOEventHandler cleanup (Philippe)
 * icount fix (Pavel)
 * RR support for random number sources (Pavel)
 * Kconfig fixes (Philippe)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJeFbG8AAoJEL/70l94x66DCpMIAKBwxBL+VegqI+ySKgmtIBQX
 LtU+ardEeZ37VfWfvuWzTFe+zQ0hsFpz/e0LHE7Ae+LVLMNWXixlmMrTIm+Xs762
 hJzxBjhUhkdrMioVYTY16Kqap4Nqaxu70gDQ32Ve2sY6xYGxYLSaJooBOU5bXVgb
 HPspHFVpeP6ZshBd1n2LXsgURE6v3AjTwqcsPCkL/AESFdkdOsoHeXjyKWJG1oPy
 W7btzlUEqVsauZI8/PhhW/8hZUvUsJVHonYLTZTyy8aklU7aOILSyT2uPXFBVUVQ
 irkQjLtD4dWlogBKO4i/QHMuwV+Asa57WNPmqv3EcIWPUWmTY84H0g2AxRgcc2M=
 =48jx
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Compat machines fix (Denis)
* Command line parsing fixes (Michal, Peter, Xiaoyao)
* Cooperlake CPU model fixes (Xiaoyao)
* i386 gdb fix (mkdolata)
* IOEventHandler cleanup (Philippe)
* icount fix (Pavel)
* RR support for random number sources (Pavel)
* Kconfig fixes (Philippe)

# gpg: Signature made Wed 08 Jan 2020 10:41:00 GMT
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (38 commits)
  chardev: Use QEMUChrEvent enum in IOEventHandler typedef
  chardev: use QEMUChrEvent instead of int
  chardev/char: Explicit we ignore some QEMUChrEvent in IOEventHandler
  monitor/hmp: Explicit we ignore a QEMUChrEvent in IOEventHandler
  monitor/qmp: Explicit we ignore few QEMUChrEvent in IOEventHandler
  virtio-console: Explicit we ignore some QEMUChrEvent in IOEventHandler
  vhost-user-blk: Explicit we ignore few QEMUChrEvent in IOEventHandler
  vhost-user-net: Explicit we ignore few QEMUChrEvent in IOEventHandler
  vhost-user-crypto: Explicit we ignore some QEMUChrEvent in IOEventHandler
  ccid-card-passthru: Explicit we ignore QEMUChrEvent in IOEventHandler
  hw/usb/redirect: Explicit we ignore few QEMUChrEvent in IOEventHandler
  hw/usb/dev-serial: Explicit we ignore few QEMUChrEvent in IOEventHandler
  hw/char/terminal3270: Explicit ignored QEMUChrEvent in IOEventHandler
  hw/ipmi: Explicit we ignore some QEMUChrEvent in IOEventHandler
  hw/ipmi: Remove unnecessary declarations
  target/i386: Add missed features to Cooperlake CPU model
  target/i386: Add new bit definitions of MSR_IA32_ARCH_CAPABILITIES
  target/i386: Fix handling of k_gs_base register in 32-bit mode in gdbstub
  hw/rtc/mc146818: Add missing dependency on ISA Bus
  hw/nvram/Kconfig: Restrict CHRP NVRAM to machines using OpenBIOS or SLOF
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-10 17:16:49 +00:00
Pan Nengyuan 638be47830 util/module: fix a memory leak
spotted by ASAN

Fixes: 81d8ccb1be
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Pan Nengyuan <pannengyuan@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1576805650-16380-1-git-send-email-pannengyuan@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-01-08 16:01:18 +01:00
Marc-André Lureau 1e419ee68f chardev: generate an internal id when none given
Internally, qemu may create chardev without ID. Those will not be
looked up with qemu_chr_find(), which prevents using qdev_prop_set_chr().

Use id_generate(), to generate an internal name (prefixed with #), so
no conflict exist with user-named chardev.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: xiaoqiang zhao <zxq_yx_007@163.com>
2020-01-07 16:50:09 +04:00
Pavel Dovgalyuk 878ec29b9c replay: record and replay random number sources
Record/replay feature of icount allows deterministic running of execution
scenarios. Some CPUs and peripheral devices read random numbers from
external sources making deterministic execution impossible.
This patch adds recording and replaying of random read operations
into guest-random module, which is used by the virtual hardware.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <157675984852.14505.15709141760677102489.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-07 12:08:39 +01:00
Peter Maydell c4d1069c25 Add dbus-vmstate
Hi,
 
 With external processes or helpers participating to the VM support, it
 becomes necessary to handle their migration. Various options exist to
 transfer their state:
 1) as the VM memory, RAM or devices (we could say that's how
    vhost-user devices can be handled today, they are expected to
    restore from ring state)
 2) other "vmstate" (as with TPM emulator state blobs)
 3) left to be handled by management layer
 
 1) is not practical, since an external processes may legitimatelly
 need arbitrary state date to back a device or a service, or may not
 even have an associated device.
 
 2) needs ad-hoc code for each helper, but is simple and working
 
 3) is complicated for management layer, QEMU has the migration timing
 
 The proposed "dbus-vmstate" object will connect to a given D-Bus
 address, and save/load from org.qemu.VMState1 owners on migration.
 
 Thus helpers can easily have their state migrated with QEMU, without
 implementing ad-hoc support (such as done for TPM emulation)
 
 D-Bus is ubiquitous on Linux (it is systemd IPC), and can be made to
 work on various other OSes. There are several implementations and good
 bindings for various languages.  (the tests/dbus-vmstate-test.c is a
 good example of how simple the implementation of services can be, even
 in C)
 
 dbus-vmstate is put into use by the libvirt series "[PATCH 00/23] Use
 a slirp helper process".
 
 v2:
  - fix build with broken mingw-glib
 -----BEGIN PGP SIGNATURE-----
 
 iQJQBAABCAA6FiEEh6m9kz+HxgbSdvYt2ujhCXWWnOUFAl4TR5ccHG1hcmNhbmRy
 ZS5sdXJlYXVAcmVkaGF0LmNvbQAKCRDa6OEJdZac5R6EEACFTd4hDG8i/GnxCFut
 MGcTusJr+2IklIT/K0qpLf0axNUoIqycwv8m0T9QhoG8h+9lMykOd1YJpNetT5qK
 gifOF2gcPK/9WIdFbX7dLSUAWpzO6fG/RzKK65Nc1uJSnXlb8JV0BU/6FrfCE+3U
 Bg5PvVtxxtwejQfQPOI7bPxOqxr/SmjUGcbFgacMAMG0Lm/VG/92kdoC6Z4Xf/bd
 FcAeiO2CiPoGXG5zD4WF1emwxnSu65PgcFpSpqvvFlmDbYlTwoMt4VWxTfkAzbAM
 IES7j2IbhUEe3p0hvMTqmmsmds1QNCBgnQI/LtQiXPTnbfpBcZ0wT6QsSZXWvHz8
 ClA9OAimxyELblTGjD9vsi3G5m2DQS+NdfPOX7hfHouVQzDJJaS8jxDItpPgXwSO
 fZ9mUO8ps3N2YTakuKNBP/IzDOuyExrBg80GF+HbEc59Uhj8Yq/awyz1XsqjQzVP
 54+TUjwC8HZxVWgMeqiJ1njPTfRJo6uAnguLbfAXj8P9vaXLtsy/3JGsmKiziXXW
 XzvQDzhfOMjm7Uo7vN7Hp3X/UYJxnaQ3dViqZnv/gqG6yv+igVlqyrTx2IBhN2NW
 DZt3c7VqVUBYFShLgfy0zDjzM/s7mFkQKCFHUsBqIwODugYEc3TTdAa60QYjX5i9
 negngax45KM6nF3tq74fJpwWVw==
 =M4kD
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/elmarco/tags/dbus-vmstate7-pull-request' into staging

Add dbus-vmstate

Hi,

With external processes or helpers participating to the VM support, it
becomes necessary to handle their migration. Various options exist to
transfer their state:
1) as the VM memory, RAM or devices (we could say that's how
   vhost-user devices can be handled today, they are expected to
   restore from ring state)
2) other "vmstate" (as with TPM emulator state blobs)
3) left to be handled by management layer

1) is not practical, since an external processes may legitimatelly
need arbitrary state date to back a device or a service, or may not
even have an associated device.

2) needs ad-hoc code for each helper, but is simple and working

3) is complicated for management layer, QEMU has the migration timing

The proposed "dbus-vmstate" object will connect to a given D-Bus
address, and save/load from org.qemu.VMState1 owners on migration.

Thus helpers can easily have their state migrated with QEMU, without
implementing ad-hoc support (such as done for TPM emulation)

D-Bus is ubiquitous on Linux (it is systemd IPC), and can be made to
work on various other OSes. There are several implementations and good
bindings for various languages.  (the tests/dbus-vmstate-test.c is a
good example of how simple the implementation of services can be, even
in C)

dbus-vmstate is put into use by the libvirt series "[PATCH 00/23] Use
a slirp helper process".

v2:
 - fix build with broken mingw-glib

# gpg: Signature made Mon 06 Jan 2020 14:43:35 GMT
# gpg:                using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg:                issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/dbus-vmstate7-pull-request:
  tests: add dbus-vmstate-test
  tests: add migration-helpers unit
  dockerfiles: add dbus-daemon to some of latest distributions
  configure: add GDBUS_CODEGEN
  Add dbus-vmstate object
  util: add dbus helper unit
  docs: start a document to describe D-Bus usage
  vmstate: replace DeviceState with VMStateIf
  vmstate: add qom interface to get id

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-06 18:22:42 +00:00
Marc-André Lureau a5021d6991 util: add dbus helper unit
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2020-01-06 18:41:32 +04:00