Commit Graph

637 Commits

Author SHA1 Message Date
Paolo Bonzini
53fabd4b86 qemu-ga: obey LISTEN_PID when using systemd socket activation
qemu-ga's socket activation support was not obeying the LISTEN_PID
environment variable, which avoids that a process uses a socket-activation
file descriptor meant for its parent.

Mess can for example ensue if a process forks a children before consuming
the socket-activation file descriptor and therefore setting O_CLOEXEC
on it.

Luckily, qemu-nbd also got socket activation code, and its copy does
support LISTEN_PID.  Some extra fixups are needed to ensure that the
code can be used for both, but that's what this patch does.  The
main change is to replace get_listen_fds's "consume" argument with
the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code.

Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-19 11:12:12 +01:00
Paolo Bonzini
6b8f0187a4 icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
icount has become much slower after tcg_cpu_exec has stopped
using the BQL.  There is also a latent bug that is masked by
the slowness.

The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
timer now has to wake up the I/O thread and wait for it.  The rendez-vous
is mediated by the BQL QemuMutex:

- handle_icount_deadline wakes up the I/O thread with BQL taken
- the I/O thread wakes up and waits on the BQL
- the VCPU thread releases the BQL a little later
- the I/O thread raises an interrupt, which calls qemu_cpu_kick
- the VCPU thread notices the interrupt, takes the BQL to
  process it and waits on it

All this back and forth is extremely expensive, causing a 6 to 8-fold
slowdown when icount is turned on.

One may think that the issue is that the VCPU thread is too dependent
on the BQL, but then the latent bug comes in.  I first tried removing
the BQL completely from the x86 cpu_exec, only to see everything break.
The only way to fix it (and make everything slow again) was to add a dummy
BQL lock/unlock pair.

This is because in -icount mode you really have to process the events
before the CPU restarts executing the next instruction.  Therefore, this
series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
the vCPU thread when running in icount mode.

The required changes include:

- make the timer notification callback wake up TCG's single vCPU thread
  when run from another thread.  By using async_run_on_cpu, the callback
  can override all_cpu_threads_idle() when the CPU is halted.

- move handle_icount_deadline after qemu_tcg_wait_io_event, so that
  the timer notification callback is invoked after the dummy work item
  wakes up the vCPU thread

- make handle_icount_deadline run the timers instead of just waking the
  I/O thread.

- stop processing the timers in the main loop

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:51:34 +01:00
Paolo Bonzini
3f53bc61a4 cpus: define QEMUTimerListNotifyCB for QEMU system emulation
There is no change for now, because the callback just invokes
qemu_notify_event.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:28:29 +01:00
Paolo Bonzini
d2528bdc19 qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
This dependency is the wrong way, and we will need util/qemu-timer.h from
sysemu/cpus.h in the next patch.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:28:18 +01:00
Jitendra Kolhe
1e356fc14b mem-prealloc: reduce large guest start-up and migration time.
Using "-mem-prealloc" option for a large guest leads to higher guest
start-up and migration time. This is because with "-mem-prealloc" option
qemu tries to map every guest page (create address translations), and
make sure the pages are available during runtime. virsh/libvirt by
default, seems to use "-mem-prealloc" option in case the guest is
configured to use huge pages. The patch tries to map all guest pages
simultaneously by spawning multiple threads. Currently limiting the
change to QEMU library functions on POSIX compliant host only, as we are
not sure if the problem exists on win32. Below are some stats with
"-mem-prealloc" option for guest configured to use huge pages.

------------------------------------------------------------------------
Idle Guest      | Start-up time | Migration time
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - single threaded (existing code)
------------------------------------------------------------------------
64 Core - 4TB   | 54m11.796s    | 75m43.843s
64 Core - 1TB   | 8m56.576s     | 14m29.049s
64 Core - 256GB | 2m11.245s     | 3m26.598s
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 8 threads
------------------------------------------------------------------------
64 Core - 4TB   | 5m1.027s      | 34m10.565s
64 Core - 1TB   | 1m10.366s     | 8m28.188s
64 Core - 256GB | 0m19.040s     | 2m10.148s
-----------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 16 threads
-----------------------------------------------------------------------
64 Core - 4TB   | 1m58.970s     | 31m43.400s
64 Core - 1TB   | 0m39.885s     | 7m55.289s
64 Core - 256GB | 0m11.960s     | 2m0.135s
-----------------------------------------------------------------------

Changed in v2:
 - modify number of memset threads spawned to min(smp_cpus, 16).
 - removed 64GB memory restriction for spawning memset threads.

Changed in v3:
 - limit number of threads spawned based on
   min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus)
 - implement memset thread specific siglongjmp in SIGBUS signal_handler.

Changed in v4
 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread
   as main thread no longer touches any pages.
 - simplify code my returning memset_thread_failed status from
   touch_all_pages.

Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Markus Armbruster
d454dbe0ee keyval: New keyval_parse()
keyval_parse() parses KEY=VALUE,... into a QDict.  Works like
qemu_opts_parse(), except:

* Returns a QDict instead of a QemuOpts (d'oh).

* Supports nesting, unlike QemuOpts: a KEY is split into key
  fragments at '.' (dotted key convention; the block layer does
  something similar on top of QemuOpts).  The key fragments are QDict
  keys, and the last one's value is updated to VALUE.

* Each key fragment may be up to 127 bytes long.  qemu_opts_parse()
  limits the entire key to 127 bytes.

* Overlong key fragments are rejected.  qemu_opts_parse() silently
  truncates them.

* Empty key fragments are rejected.  qemu_opts_parse() happily
  accepts empty keys.

* It does not store the returned value.  qemu_opts_parse() stores it
  in the QemuOptsList.

* It does not treat parameter "id" specially.  qemu_opts_parse()
  ignores all but the first "id", and fails when its value isn't
  id_wellformed(), or duplicate (a QemuOpts with the same ID is
  already stored).  It also screws up when a value contains ",id=".

* Implied value is not supported.  qemu_opts_parse() desugars "foo" to
  "foo=on", and "nofoo" to "foo=off".

* An implied key's value can't be empty, and can't contain ','.

I intend to grow this into a saner replacement for QemuOpts.  It'll
take time, though.

Note: keyval_parse() provides no way to do lists, and its key syntax
is incompatible with the __RFQDN_ prefix convention for downstream
extensions, because it blindly splits at '.', even in __RFQDN_.  Both
issues will be addressed later in the series.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Markus Armbruster
0587568780 qmp: Dumb down how we run QMP command registration
The way we get QMP commands registered is high tech:

* qapi-commands.py generates qmp_init_marshal() that does the actual work

* it also generates the magic to register it as a MODULE_INIT_QAPI
  function, so it runs when someone calls
  module_call_init(MODULE_INIT_QAPI)

* main() calls module_call_init()

QEMU needs to register a few non-qapified commands.  Same high tech
works: monitor.c has its own qmp_init_marshal() along with the magic
to make it run in module_call_init(MODULE_INIT_QAPI).

QEMU also needs to unregister commands that are not wanted in this
build's configuration (commit 5032a16).  Simple enough:
qmp_unregister_commands_hack().  The difficulty is to make it run
after the generated qmp_init_marshal().  We can't simply run it in
monitor.c's qmp_init_marshal(), because the order in which the
registered functions run is indeterminate.  So qmp_init_marshal()
registers qmp_unregister_commands_hack() separately.  Since
registering *appends* to the list of registered functions, this will
make it run after all the functions that have been registered already.

I suspect it takes a long and expensive computer science education to
not find this silly.

Dumb it down as follows:

* Drop MODULE_INIT_QAPI entirely

* Give the generated qmp_init_marshal() external linkage.

* Call it instead of module_call_init(MODULE_INIT_QAPI)

* Except in QEMU proper, call new monitor_init_qmp_commands() that in
  turn calls the generated qmp_init_marshal(), registers the
  additional commands and unregisters the unwanted ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-5-git-send-email-armbru@redhat.com>
2017-03-05 09:02:10 +01:00
Peter Maydell
17783ac828 ppc patch queuye for 2017-03-03
This will probably be my last pull request before the hard freeze.  It
 has some new work, but that has all been posted in draft before the
 soft freeze, so I think it's reasonable to include in qemu-2.9.
 
 This batch has:
     * A substantial amount of POWER9 work
         * Implements the legacy (hash) MMU for POWER9
 	* Some more preliminaries for implementing the POWER9 radix
           MMU
 	* POWER9 has_work
 	* Basic POWER9 compatibility mode handling
 	* Removal of some premature tests
     * Some cleanups and fixes to the existing MMU code to make the
       POWER9 work simpler
     * A bugfix for TCG multiply adds on power
     * Allow pseries guests to access PCIe extended config space
 
 This also includes a code-motion not strictly in ppc code - moving
 getrampagesize() from ppc code to exec.c.  This will make some future
 VFIO improvements easier, Paolo said it was ok to merge via my tree.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYuOEEAAoJEGw4ysog2bOSHD0P/jBg/qr/4KnsB1KhnlVrB2sP
 vy2d3bGGlUWr9Z+CK/PMCRB8ekFgQLjidLIXji6mviUocv6m3WsVrnbLF/oOL/IT
 NPMVAffw7q804YVu1Ns9R82d6CIqHTy//bpg69tFMcJmhL9fqPan3wTZZ9JeiyAm
 SikqkAHBSW4SxKqg8ApaSqx5L2QTqyfkClR0sLmgM0JtmfJrbobpQ6bMtdPjUZ9L
 n2gnpO2vaWCa1SEQrRrdELqvcD8PHkSJapWOBXOkpGWxoeov/PYxOgkpdDUW4qYY
 lVLtp1Vd3OB/h3Unqfw32DNiHA5p89hWPX5UybKMgRVL9Cv2/lyY47pcY8XTeNzn
 bv84YRbFJeI+GgoEnghmtq+IM8XiW/cr9rWm9wATKfKGcmmFauumALrsffUpHVCM
 4hSNgBv5t2V9ptZ+MDlM/Ku+zk9GoqwQ+hemdpVtiyhOtGUPGFBn5YLE4c2DHFxV
 +L9JtBnFn8obnssNoz0wL+QvZchT1qUHMhH5CWAanjw9CTDp/YwQ2P01zK+00s9d
 4cB7fUG3WNto5eXXEGMaXeDsUEz8z//hTe3j5sVbnHsXi0R3dhv7iryifmx4bUKU
 H9EwAc+uNUHbvBy7u6IWg0I8P2n00CCO6JqXijQ92zELJ5j0XhzHUI2dOXn+zyEo
 3FZu56LFnSSUBEXuTjq4
 =PcNw
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging

ppc patch queuye for 2017-03-03

This will probably be my last pull request before the hard freeze.  It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.

This batch has:
    * A substantial amount of POWER9 work
        * Implements the legacy (hash) MMU for POWER9
	* Some more preliminaries for implementing the POWER9 radix
          MMU
	* POWER9 has_work
	* Basic POWER9 compatibility mode handling
	* Removal of some premature tests
    * Some cleanups and fixes to the existing MMU code to make the
      POWER9 work simpler
    * A bugfix for TCG multiply adds on power
    * Allow pseries guests to access PCIe extended config space

This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c.  This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.

# gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170303:
  target/ppc: rewrite f[n]m[add,sub] using float64_muladd
  spapr: Small cleanup of PPC MMU enums
  spapr_pci: Advertise access to PCIe extended config space
  target/ppc: Rework hash mmu page fault code and add defines for clarity
  target/ppc: Move no-execute and guarded page checking into new function
  target/ppc: Add execute permission checking to access authority check
  target/ppc: Add Instruction Authority Mask Register Check
  hw/ppc/spapr: Add POWER9 to pseries cpu models
  target/ppc/POWER9: Add cpu_has_work function for POWER9
  target/ppc/POWER9: Add POWER9 pa-features definition
  target/ppc/POWER9: Add POWER9 mmu fault handler
  target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
  target/ppc: Add patb_entry to sPAPRMachineState
  target/ppc/POWER9: Add POWERPC_MMU_V3 bit
  powernv: Don't test POWER9 CPU yet
  exec, kvm, target-ppc: Move getrampagesize() to common code
  target/ppc: Add POWER9/ISAv3.00 to compat_table

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04 16:31:14 +00:00
Paolo Bonzini
a16fc07ebd cpus: reorganize signal handling code
Move the KVM "eat signals" code under CONFIG_LINUX, in preparation
for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately,
without passing it to KVM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
d98d407234 cpus: remove ugly cast on sigbus_handler
The cast is there because sigbus_handler is invoked via sigfd_handler.
But it feels just wrong to use struct qemu_signalfd_siginfo in the
prototype of a function that is passed to sigaction.

Instead, do a simple-minded conversion of qemu_signalfd_siginfo to
siginfo_t.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Alexey Kardashevskiy
9c60766887 exec, kvm, target-ppc: Move getrampagesize() to common code
getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.

However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.

This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Marc-André Lureau
e703dcbaee timer: use an inline function for free
Similarly to allocation, do it from an inline function. This allows
tests to only use the headers for allocation/free of timer.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-01 00:09:28 +04:00
Pradeep Jagadeesh
a2a7862ca9 throttle: factor out duplicate code
This patch removes the redundant throttle code that was present in
block and fsdev device files. Now the common code is moved
to a single file.

Signed-off-by: Pradeep Jagadeesh <pradeep.jagadeesh@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
(fix indent nit, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-02-28 10:31:46 +01:00
Markus Armbruster
f46bfdbfc8 util/cutils: Change qemu_strtosz*() from int64_t to uint64_t
This will permit its use in parse_option_size().

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:Block layer core)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1487708048-2131-24-git-send-email-armbru@redhat.com>
2017-02-23 20:35:36 +01:00
Markus Armbruster
f17fd4fdf0 util/cutils: Return qemu_strtosz*() error and value separately
This makes qemu_strtosz(), qemu_strtosz_mebi() and
qemu_strtosz_metric() similar to qemu_strtoi64(), except negative
values are rejected.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com> (maintainer:X86)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: Max Reitz <mreitz@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:Block layer core)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1487708048-2131-23-git-send-email-armbru@redhat.com>
2017-02-23 20:35:36 +01:00
Markus Armbruster
466dea14e6 util/cutils: New qemu_strtosz()
Most callers of qemu_strtosz_suffix() pass QEMU_STRTOSZ_DEFSUFFIX_B.
Capture the pattern in new qemu_strtosz().

Inline qemu_strtosz_suffix() into its only remaining caller.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1487708048-2131-17-git-send-email-armbru@redhat.com>
2017-02-23 20:35:36 +01:00
Markus Armbruster
e591591b32 util/cutils: Rename qemu_strtosz() to qemu_strtosz_MiB()
With qemu_strtosz(), no suffix means mebibytes.  It's used rarely.
I'm going to add a similar function where no suffix means bytes.
Rename qemu_strtosz() to qemu_strtosz_MiB() to make the name
qemu_strtosz() available for the new function.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1487708048-2131-16-git-send-email-armbru@redhat.com>
2017-02-23 20:35:36 +01:00
Markus Armbruster
d2734d2629 util/cutils: New qemu_strtosz_metric()
To parse numbers with metric suffixes, we use

    qemu_strtosz_suffix_unit(nptr, &eptr, QEMU_STRTOSZ_DEFSUFFIX_B, 1000)

Capture this in a new function for legibility:

    qemu_strtosz_metric(nptr, &eptr)

Replace test_qemu_strtosz_suffix_unit() by test_qemu_strtosz_metric().

Rename qemu_strtosz_suffix_unit() to do_strtosz() and give it internal
linkage.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1487708048-2131-15-git-send-email-armbru@redhat.com>
2017-02-23 20:35:36 +01:00
Markus Armbruster
b30d188677 util/cutils: Rename qemu_strtoll(), qemu_strtoull()
The name qemu_strtoll() suggests conversion to long long, but it
actually converts to int64_t.  Rename to qemu_strtoi64().

The name qemu_strtoull() suggests conversion to unsigned long long,
but it actually converts to uint64_t.  Rename to qemu_strtou64().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1487708048-2131-7-git-send-email-armbru@redhat.com>
2017-02-23 20:35:35 +01:00
Paolo Bonzini
a7b91d35ba coroutine-lock: make CoRwlock thread-safe and fair
This adds a CoMutex around the existing CoQueue.  Because the write-side
can just take CoMutex, the old "writer" field is not necessary anymore.
Instead of removing it altogether, count the number of pending writers
during a read-side critical section and forbid further readers from
entering.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-7-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:40 +00:00
Paolo Bonzini
1ace7ceac5 coroutine-lock: add mutex argument to CoQueue APIs
All that CoQueue needs in order to become thread-safe is help
from an external mutex.  Add this to the API.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:40 +00:00
Paolo Bonzini
f8c6e1cbc3 coroutine-lock: place CoMutex before CoQueue in header
This will avoid forward references in the next patch.  It is also
more logical because CoQueue is not anymore the basic primitive.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:40 +00:00
Paolo Bonzini
480cff6322 coroutine-lock: add limited spinning to CoMutex
Running a very small critical section on pthread_mutex_t and CoMutex
shows that pthread_mutex_t is much faster because it doesn't actually
go to sleep.  What happens is that the critical section is shorter
than the latency of entering the kernel and thus FUTEX_WAIT always
fails.  With CoMutex there is no such latency but you still want to
avoid wait and wakeup.  So introduce it artificially.

This only works with one waiters; because CoMutex is fair, it will
always have more waits and wakeups than a pthread_mutex_t.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:40 +00:00
Paolo Bonzini
fed20a70e3 coroutine-lock: make CoMutex thread-safe
This uses the lock-free mutex described in the paper '"Blocking without
Locking", or LFTHREADS: A lock-free thread library' by Gidenstam and
Papatriantafilou.  The same technique is used in OSv, and in fact
the code is essentially a conversion to C of OSv's code.

[Added missing coroutine_fn in tests/test-aio-multithread.c.
--Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213181244.16297-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:39:40 +00:00
Paolo Bonzini
0c330a734b aio: introduce aio_co_schedule and aio_co_wake
aio_co_wake provides the infrastructure to start a coroutine on a "home"
AioContext.  It will be used by CoMutex and CoQueue, so that coroutines
don't jump from one context to another when they go to sleep on a
mutex or waitqueue.  However, it can also be used as a more efficient
alternative to one-shot bottom halves, and saves the effort of tracking
which AioContext a coroutine is running on.

aio_co_schedule is the part of aio_co_wake that starts a coroutine
on a remove AioContext, but it is also useful to implement e.g.
bdrv_set_aio_context callbacks.

The implementation of aio_co_schedule is based on a lock-free
multiple-producer, single-consumer queue.  The multiple producers use
cmpxchg to add to a LIFO stack.  The consumer (a per-AioContext bottom
half) grabs all items added so far, inverts the list to make it FIFO,
and goes through it one item at a time until it's empty.  The data
structure was inspired by OSv, which uses it in the very code we'll
"port" to QEMU for the thread-safe CoMutex.

Most of the new code is really tests.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170213135235.12274-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-02-21 11:14:07 +00:00
Daniel P. Berrange
e998e2090f util: add iterators for QemuOpts values
To iterate over all QemuOpts currently requires using a callback
function which is inconvenient for control flow. Add support for
using iterator functions more directly

  QemuOptsIter iter;
  QemuOpt *opt;

  qemu_opts_iter_init(&iter, opts, "repeated-key");
  while ((opt = qemu_opts_iter_next(&iter)) != NULL) {
      ....do something...
  }

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170203120649.15637-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-02-09 17:28:49 +01:00
Peter Maydell
5459ef3bff ppc patch queue 2017-02-02
This obsoletes ppc-for-2.9-20170112, which had a MacOS build bug.
 
 This is a long overdue ppc pull request for qemu-2.9.  It's been a
 long time coming due to some holidays and inconveniently timed
 problems with testing.  So, there's a lot in here:
 
     * More POWER9 instruction implementations for TCG
     * The simpler parts of my CPU compatibility mode cleanup
         * This changes behaviour to prefer compatibility modes over
           "raW" mode for new machine type versions
     * New "40p" machine type which is essentially a modernized and
       cleaned up "prep".  The intention is that it will replace "prep"
       once it has some more testing and polish.
     * Add pseries-2.9 machine type
     * Implement H_SIGNAL_SYS_RESET hypercall
     * Consolidate the two alternate CPU init paths in pseries by
       making it always go through CPU core objects to initialize CPU
     * A number of bugfixes and cleanups
     * Stop the guest timebase when the guest is stopped under KVM.
       This makes the guest system clock also stop when paused, which
       matches the x86 behaviour.
     * Some preliminary cleanups leading towards implementation of the
       POWER9 MMU.
 
 There are also some changes not strictly related to ppc code, but for
 its benefit:
 
     * Limit the pxi-expander-bridge (PXB) device to x86 guests only
       (it's essentially a hack to work around historical x86
       limitations)
     * Some additions to the 128-bit math in host_utils, necessary for
       some of the new instructions.
     * Revise a number of qtests and enable them for ppc
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJYko4AAAoJEGw4ysog2bOStEYQAIk0Pd6ifZzJUcTWQaR8+AZ7
 nTbzQyWtSHqSAiwBNsykJMFXV1liZVglf2e+VBsrVOwKoU50VOyVm5LspG2z1h8N
 Rxe4FGA2MA//2F3+9/AP8Oe3RdsClNCDaXAVuCFRP4xQWxqqwwasChDeS4Ph/cZq
 CXnlhKTpk9v5vSCsr64bUOSYh3RPumnQepiBgT82hOo7R+VaJ79AFbTeCYKkd0hY
 Sq8g3mg0zOX1ekNXPk1h8oZWqkoZGbqKiXgoy/evGXWURVzTSJO6VTyM65tdwWB7
 Zds77gYAYCIYKq+Iwv4iBCmo4KJofjKQcQepQUr+eGDv9syXebtp6fY0btnIS+DX
 uGzzaixZNms9r2+FAiIlKwIeQgQvl76lYEGmvBrbrgSOyA/7GAkOId0E0Ul6D5LW
 EJSwk9ZDbyE0JBEq6Bx+LClpwye+bpdScU26djQTTcWpFApIeJTyG9V6b1xwulVZ
 rw68ZvfMYxktkvhTbEtvk2O9YZI5eQStBJkmJXeOiOduiP93aiC82MM1Jp+82Q1E
 4qRVvCpGTwzF3GLFciUKAqmwfYxByo4G0/dwG8qw6WNEemLyXFHV5TkzLhgwl3kC
 gDGl5AdH4MXj8NRjuHcDiGXfePBCD578dmz4xo5ZLA2yBavxkRzM8QsEUmD8hf5w
 jhLgyKt0G2hNNtOnGOdG
 =vLVl
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170202' into staging

ppc patch queue 2017-02-02

This obsoletes ppc-for-2.9-20170112, which had a MacOS build bug.

This is a long overdue ppc pull request for qemu-2.9.  It's been a
long time coming due to some holidays and inconveniently timed
problems with testing.  So, there's a lot in here:

    * More POWER9 instruction implementations for TCG
    * The simpler parts of my CPU compatibility mode cleanup
        * This changes behaviour to prefer compatibility modes over
          "raW" mode for new machine type versions
    * New "40p" machine type which is essentially a modernized and
      cleaned up "prep".  The intention is that it will replace "prep"
      once it has some more testing and polish.
    * Add pseries-2.9 machine type
    * Implement H_SIGNAL_SYS_RESET hypercall
    * Consolidate the two alternate CPU init paths in pseries by
      making it always go through CPU core objects to initialize CPU
    * A number of bugfixes and cleanups
    * Stop the guest timebase when the guest is stopped under KVM.
      This makes the guest system clock also stop when paused, which
      matches the x86 behaviour.
    * Some preliminary cleanups leading towards implementation of the
      POWER9 MMU.

There are also some changes not strictly related to ppc code, but for
its benefit:

    * Limit the pxi-expander-bridge (PXB) device to x86 guests only
      (it's essentially a hack to work around historical x86
      limitations)
    * Some additions to the 128-bit math in host_utils, necessary for
      some of the new instructions.
    * Revise a number of qtests and enable them for ppc

# gpg: Signature made Thu 02 Feb 2017 01:40:16 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170202: (107 commits)
  hw/ppc/pnv: Use error_report instead of hw_error if a ROM file can't be found
  ppc/kvm: Handle the "family" CPU via alias instead of registering new types
  target/ppc/mmu_hash64: Fix incorrect shift value in amr calculation
  target/ppc/mmu_hash64: Fix printing unsigned as signed int
  tcg/POWER9: NOOP the cp_abort instruction
  target/ppc/debug: Print LPCR register value if register exists
  target-ppc: Add xststdc[sp, dp, qp] instructions
  target-ppc: Add xvtstdc[sp,dp] instructions
  target-ppc: Add MMU model check for booke machines
  ppc: switch to constants within BUILD_BUG_ON
  target/ppc/cpu-models: Fix/remove bad CPU aliases
  target/ppc: Remove unused POWERPC_FAMILY(POWER)
  spapr: clock should count only if vm is running
  ppc: Remove unused function cpu_ppc601_rtc_init()
  target/ppc: Add pcr_supported to POWER9 cpu class definition
  powerpc/cpu-models: rename ISAv3.00 logical PVR definition
  target-ppc: Add xvcv[hpsp, sphp] instructions
  target-ppc: Add xsmulqp instruction
  target-ppc: Add xsdivqp instruction
  target-ppc: Add xscvsdqp and xscvudqp instructions
  ...

# Conflicts:
#	hw/pci-bridge/Makefile.objs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-02-02 18:48:06 +00:00
Michael S. Tsirkin
ed63ec0d22 ARRAY_SIZE: check that argument is an array
It's a familiar pattern: some code uses ARRAY_SIZE, then refactoring
changes the argument from an array to a pointer to a dynamically
allocated buffer.  Code keeps compiling but any ARRAY_SIZE calls now
return the size of the pointer divided by element size.

Let's add build time checks to ARRAY_SIZE before we allow more
of these in the code-base.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-02-01 03:37:17 +02:00
Michael S. Tsirkin
d757573e69 compiler: expression version of QEMU_BUILD_BUG_ON
QEMU_BUILD_BUG_ON uses a typedef in order to be safe
to use outside functions, but sometimes it's useful
to have a version that can be used within an expression.
Following what Linux does, introduce QEMU_BUILD_BUG_ON_ZERO
that return zero after checking condition at build time.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-02-01 03:37:17 +02:00
Michael S. Tsirkin
f291887e8e compiler: rework BUG_ON using a struct
There are theoretical concerns that some compilers might not trigger
build failures on attempts to define an array of size (x ? -1 : 1) where
x is a variable and make it a variable sized array instead. Let rewrite
using a struct with a negative bit field size instead as there are no
dynamic bit field sizes.  This is similar to what Linux does.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-02-01 03:37:17 +02:00
Michael S. Tsirkin
60abf0a5e0 QEMU_BUILD_BUG_ON: use __COUNTER__
Some headers use QEMU_BUILD_BUG_ON. This causes a problem
if the C file including that header happens to have
QEMU_BUILD_BUG_ON at the same line number.

Fix using a widely available extension: __COUNTER__.
If unavailable, provide a stub.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-02-01 03:37:17 +02:00
Michael S. Tsirkin
f298318284 compiler: drop ; after BUILD_BUG_ON
All users include the trailing ; anyway, let's require that -
it seems cleaner.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-01-31 15:57:27 +02:00
Jose Ricardo Ziviani
f539fbe337 host-utils: Implement unsigned quadword left/right shift and unit tests
Implements 128-bit left shift and right shift as well as their
testcases. By design, shift silently mods by 128, so the caller is
responsible to assert the shift range if necessary.

Left shift sets the overflow flag if any non-zero digit is shifted out.

Examples:
 ulshift(&low, &high, 250, &overflow);
 equivalent: n << 122

 urshift(&low, &high, -2);
 equivalent: n << 126

Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[dwg: Added test-shift128 to .gitignore]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-01-31 10:10:14 +11:00
Marc-André Lureau
0ec7b3e7f2 char: rename CharDriverState Chardev
Pick a uniform chardev type name.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:59 +01:00
Max Reitz
20a579de84 hbitmap: Add hbitmap_is_serializable()
Bitmaps with a granularity of 58 or above can be neither serialized nor
deserialized (see the comment in the function added in this series for
an explanation). This patch adds a function so that we can check whether
a bitmap actually can be (de-)serialized at all, thus avoiding failing
the necessary assertion in hbitmap_serialization_granularity().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20161115225746.3590-2-mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-01-26 10:25:01 +08:00
Peter Maydell
ffb5a69c31 trivial patches for 2017-01-24
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJYh7icAAoJEHAbT2saaT5ZixMH/2qr2TPaAARnTPFzf/mfpHvR
 jYKZary6L//DTCqjrys5zAVzKUg8rCPGwWI2T2FDsos7Ku4MKBBSfDmnabc+iu0P
 7Rkr18dPGi5ozAiHcGzNXivODVrXBqZT3KcJZ1aYo04Bl0xszxO+fWp2B6n9aXIs
 g4HFq98XGXut8Rs7wNcsUOGHTkIupnzxt+TYXFhezRPq/6bRWZj8pPjwiPReZJBP
 w6IhlVkIxsMdW1tpy+Im21aKCWO23mvQYj+ZiS2eb2F/jcSshL9xp1vqlbNU65H1
 w/zQaUE+m0yJhF7sVKM76101vnDJ1DPxiD/45BnF5p/xwiYcUwpS5UG53riFxAA=
 =B6et
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2017-01-24

# gpg: Signature made Tue 24 Jan 2017 20:27:08 GMT
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch: (31 commits)
  hw/isa/isa-bus: Set category of the "isabus-bridge" device
  usb: Set category and description of the MTP device
  gdbstub.c: update old error report statements
  gdbstub.c: fix GDB connection segfault caused by empty machines
  scsi-disk: add 'fall through' comment to switch VERIFY cases
  Drop duplicate display option documentation
  hw/display/framebuffer.c: Avoid overflow for framebuffers > 4GB
  win32: use glib gpoll if glib >= 2.50
  util/mmap-alloc: refactor a little bit for readability
  util/mmap-alloc: check parameter before using
  vfio: remove a duplicated word in comments
  docs: sync pci-ids.txt
  disas/cris.c: Fix Coverity warning about unchecked NULL
  lm32: milkymist-tmu2: fix another integer overflow
  hw/i386/kvmvapic: Remove dead code in patch_hypercalls()
  doc/usb2: fix typo
  qga: fix erroneous argument to strerror
  block: remove dead check
  pci-assign: avoid pointless stat
  qemu-img: remove dead check
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-01-25 10:42:26 +00:00
Stefan Weil
5bb8590d37 include: Fix typos found by codespell
Add also a missing parenthesis in a comment.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2017-01-24 23:26:52 +03:00
Jianjun Duan
94869d5c52 migration: migrate QTAILQ
Currently we cannot directly transfer a QTAILQ instance because of the
limitation in the migration code. Here we introduce an approach to
transfer such structures. We created VMStateInfo vmstate_info_qtailq
for QTAILQ. Similar VMStateInfo can be created for other data structures
such as list.

When a QTAILQ is migrated from source to target, it is appended to the
corresponding QTAILQ structure, which is assumed to have been properly
initialized.

This approach will be used to transfer pending_events and ccs_list in spapr
state.

We also create some macros in qemu/queue.h to access a QTAILQ using pointer
arithmetic. This ensures that we do not depend on the implementation
details about QTAILQ in the migration code.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Message-Id: <1484852453-12728-3-git-send-email-duanj@linux.vnet.ibm.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-01-24 17:54:47 +00:00
Eduardo Habkost
726401be06 arch_init: Remove unnecessary default_config_files table
The existing default_config_files table in arch_init.c has a
single entry, making it completely unnecessary. The whole code
can be replaced by a single qemu_read_config_file() call in vl.c.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170117180051.11958-1-ehabkost@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-01-23 21:25:36 -02:00
Daniel P. Berrange
c1b412f1d9 io: introduce a DNS resolver API
Currently DNS resolution is done automatically as part
of the creation of a QIOChannelSocket object instance.
This works ok for network clients where you just end
up a single network socket, but for servers, the results
of DNS resolution may require creation of multiple
sockets.

Introducing a DNS resolver API allows DNS resolution
to be separated from the socket object creation. This
will make it practical to create multiple QIOChannelSocket
instances for servers.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-01-23 15:32:46 +00:00
Peter Maydell
598cf1c805 * QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
 * Memory leak fixes (Li Qiang, me)
 * Ctrl-a b regression (Marc-André)
 * Stubs cleanups and fixes (Leif, me)
 * hxtool tweak (me)
 * HAX support (Vincent)
 * QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
 * PC_COMPAT_2_8 fix (Marcelo)
 * stronger bitmap assertions (Peter)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJYggc9FBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 5pMH/092iVHw1la8VmphQd8W7hkCHckvVbwaEJ+n4BP8MjeUNmYFJX+op9Qlpqfe
 ekYqQgK69v2UwuofVK2gqS+Y2EyFHivTESk5pS3SM3lTewV1fzCM/HVG3pTxV/ol
 V+eBnp+shrfNG3Eg7YThTqx4LkDUp24Pd3HJVblQZMVpqGzL2xUuUQzSf8F/eeQJ
 xO61pm0ovpCY5MCg3kPLx8GIkPAmcXo5jhMCTz5aLnQW6TO/mwx271a4UE2RTLZ7
 cFjNhxdGSzlnn2RwId4HVYWGU42taW6mpa8NX1hVVUXa1A2qlAfi5N/WLaH0aGYR
 J5ZTIaXdPUBx2SrUmd8udj4a818=
 =H5BQ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* QOM interface fix (Eduardo)
* RTC fixes (Gaohuai, Igor)
* Memory leak fixes (Li Qiang, me)
* Ctrl-a b regression (Marc-André)
* Stubs cleanups and fixes (Leif, me)
* hxtool tweak (me)
* HAX support (Vincent)
* QemuThread, exec.c and SCSI fixes (Roman, Xinhua, me)
* PC_COMPAT_2_8 fix (Marcelo)
* stronger bitmap assertions (Peter)

# gpg: Signature made Fri 20 Jan 2017 12:49:01 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (35 commits)
  pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8
  bitmap: assert that start and nr are non negative
  Revert "win32: don't run subprocess tests on Mingw32 platform"
  hax: add Darwin support
  Plumb the HAXM-based hardware acceleration support
  target/i386: Add Intel HAX files
  kvm: move cpu synchronization code
  KVM: PPC: eliminate unnecessary duplicate constants
  ramblock-notifier: new
  char: fix ctrl-a b not working
  exec: Add missing rcu_read_unlock
  x86: ioapic: fix fail migration when irqchip=split
  x86: ioapic: dump version for "info ioapic"
  x86: ioapic: add traces for ioapic
  hxtool: emit Texinfo headings as @subsection
  qemu-thread: fix qemu_thread_set_name() race in qemu_thread_create()
  serial: fix memory leak in serial exit
  scsi-block: fix direction of BYTCHK test for VERIFY commands
  pc: fix crash in rtc_set_memory() if initial cpu is marked as hotplugged
  acpi: filter based on CONFIG_ACPI_X86 rather than TARGET
  ...

# Conflicts:
#	include/hw/i386/pc.h
2017-01-20 16:42:07 +00:00
Paolo Bonzini
d6da1e9eca event_notifier: cleanups around event_notifier_set_handler
Remove the useless is_external argument.  Since the iohandler
AioContext is never used for block devices, aio_disable_external
is never called on it.  This lets us remove stubs/iohandler.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-16 17:52:35 +01:00
Paolo Bonzini
fbcc3e5004 qemu-thread: optimize QemuLockCnt with futexes on Linux
This is complex, but I think it is reasonably documented in the source.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170112180800.21085-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-16 13:25:18 +00:00
Paolo Bonzini
51dee5e465 qemu-thread: introduce QemuLockCnt
A QemuLockCnt comprises a counter and a mutex, with primitives
to increment and decrement the counter, and to take and release the
mutex.  It can be used to do lock-free visits to a data structure
whenever mutexes would be too heavy-weight and the critical section
is too long for RCU.

This could be implemented simply by protecting the counter with the
mutex, but QemuLockCnt is harder to misuse and more efficient.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170112180800.21085-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-01-16 13:25:17 +00:00
Richard Henderson
7bdcecb7b2 qemu/host-utils.h: Reduce the operation count in the fallback ctpop
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-01-10 08:49:59 -08:00
Kevin Wolf
536fca7f7e coroutine: Introduce qemu_coroutine_enter_if_inactive()
In the context of asynchronous work, if we have a worker coroutine that
didn't yield, the parent coroutine cannot be reentered because it hasn't
yielded yet. In this case we don't even have to reenter the parent
because it will see that the work is already done and won't even yield.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2017-01-09 13:30:52 +01:00
Yaowei Bai
11717bc93a main-loop: update comment for qemu_mutex_lock/unlock_iothread
Commit 49cf57281b (vl: delay thread initialization after daemonization)
makes the global mutex is taken after daemonization instead before
daemonization by qemu_init_main_loop().

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-2-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:24 +01:00
Yaowei Bai
45241cf9d7 timer: fix misleading comment in timer.h
It's timer to expire, not clock.

Signed-off-by: Yaowei Bai <baiyaowei@cmss.chinamobile.com>
Message-Id: <1480566640-27264-1-git-send-email-baiyaowei@cmss.chinamobile.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:24 +01:00
Paolo Bonzini
1f4e496e1f exec: introduce MemoryRegionCache
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap.  This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:23 +01:00
Nikunj A Dadhania
ecce0369b8 bitops: fix rol/ror when shift is zero
All the variants for rol/ror have a bug in case where the shift == 0.
For example rol32, would generate:

    return (word << 0) | (word >> 32);

Which though works, would be flagged as a runtime error on clang's
sanitizer.

Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-11-15 10:05:50 +11:00