qemu_aio_coroutine_enter() is (indirectly) called recursively when
processing co_queue_wakeup. This can lead to stack exhaustion.
This patch rewrites co_queue_wakeup in an iterative fashion (instead of
recursive) with bounded memory usage to prevent stack exhaustion.
qemu_co_queue_run_restart() is inlined into qemu_aio_coroutine_enter()
and the qemu_coroutine_enter() call is turned into a loop to avoid
recursion.
There is one change that is worth mentioning: Previously, when
coroutine A queued coroutine B, qemu_co_queue_run_restart() entered
coroutine B from coroutine A. If A was terminating then it would still
stay alive until B yielded. After this patch B is entered by A's parent
so that a A can be deleted immediately if it is terminating.
It is safe to make this change since B could never interact with A if it
was terminating anyway.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QSIMPLEQ_CONCAT(a, b) joins a = a + b. The new QSIMPLEQ_PREPEND(a, b)
API joins a = b + a.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180322152834.12656-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
_Static_assert() allows us to specify messages, and that may come in
handy. Even without _Static_assert(), encouraging developers to put a
helpful message next to the QEMU_BUILD_BUG_* may make debugging easier
whenever it breaks.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20180224154033.29559-2-mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQEcBAABAgAGBQJapqaMAAoJEL/70l94x66DQoUH/Rvg+a8giz/SrEA4P8D3Cb2z
4GNbNUUoy4oU0ltD5IAMskMwpOsvl1batE0D+pKIlfO9NV4+Cj2kpgo0p9TxoYqM
VCby3wRtx27zb5nVytC6M++iIKXmeEMqXmFw61I6umddNPSl4IR3hiHEE0DM+7dV
UPIOvJeEiazyQaw3Iw+ZctNn8dDBKc/+6oxP9xRcYTaZ6hB4G9RZkqGNNSLcJkk7
R0UotdjzIZhyWMOkjIwlpTF4sWv8gsYUV4bPYKMYho5B0Obda2dBM3I1kpA8yDa/
xZ5lheOaAVBZvM5aMIcaQPa65MO9hLyXFmhMOgyfpJhLBBz6Qpa4OLLI6DeTN+0=
=UAgA
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Record-replay lockstep execution, log dumper and fixes (Alex, Pavel)
* SCSI fix to pass maximum transfer size (Daniel Barboza)
* chardev fixes and improved iothread support (Daniel Berrangé, Peter)
* checkpatch tweak (Eric)
* make help tweak (Marc-André)
* make more PCI NICs available with -net or -nic (myself)
* change default q35 NIC to e1000e (myself)
* SCSI support for NDOB bit (myself)
* membarrier system call support (myself)
* SuperIO refactoring (Philippe)
* miscellaneous cleanups and fixes (Thomas)
# gpg: Signature made Mon 12 Mar 2018 16:10:52 GMT
# gpg: using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (69 commits)
tcg: fix cpu_io_recompile
replay: update documentation
replay: save vmstate of the asynchronous events
replay: don't process async events when warping the clock
scripts/replay-dump.py: replay log dumper
replay: avoid recursive call of checkpoints
replay: check return values of fwrite
replay: push replay_mutex_lock up the call tree
replay: don't destroy mutex at exit
replay: make locking visible outside replay code
replay/replay-internal.c: track holding of replay_lock
replay/replay.c: bump REPLAY_VERSION again
replay: save prior value of the host clock
replay: added replay log format description
replay: fix save/load vm for non-empty queue
replay: fixed replay_enable_events
replay: fix processing async events
cpu-exec: fix exception_index handling
hw/i386/pc: Factor out the superio code
hw/alpha/dp264: Use the TYPE_SMC37C669_SUPERIO
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# default-configs/i386-softmmu.mak
# default-configs/x86_64-softmmu.mak
The fd_is_socket() helper method is useful in a few places, so put it in
the common sockets code. Make the code more compact while moving it.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are qemu_strtoNN functions for various sized integers. This adds two
more for plain int & unsigned int types, with suitable range checking.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
iQEcBAABAgAGBQJapqP6AAoJEJykq7OBq3PIZasIAL+NKBQGa/e0FD28PYdLU/JE
sKZZ0O6+eVTCejGXap4bzbKOy+qZyOXvaRk5KNREc5A9R8HFBt5GotMfE80Cw9Nt
rryX+qVdf4w27u2jMqY4215jD5jy/nPijRQ0a8UBsi6z2PXVPPNeS3lMB8RSFEZS
IZu+l3j1op1wUlM4GfZvLCjmgHC+73lk6a5xZLJ2UvH9UoqJepgVZnSs2YvOctzG
LVGMhk6/yAy4hh3NWx/M2h2B2ASMJJya8XrLgelAVnr6CxKBeBII0bSPur+1YIH7
OkJhNsk6QKSWNFKtzXE6N+y1ryWLnbE8vzKSZt+xSzUDjhnqTm5iFpZQ+Ed16qA=
=nCAn
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Mon 12 Mar 2018 15:59:54 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/tracing-pull-request:
trace: only permit standard C types and fixed size integer types
trace: remove use of QEMU specific types from trace probes
trace: include filename when printing parser error messages
simpletrace: fix timestamp argument type
log-for-trace.h: Split out parts of log.h used by trace.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds saving/restoring of the host clock field 'last'.
It is used in host clock calculation and therefore clock may
become incorrect when using restored vmstate.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180227095226.1060.50975.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Actually enable the global memory barriers if supported by the OS.
Because only recent versions of Linux include the support, they
are disabled by default. Note that it also has to be disabled
for QEMU to run under Wine.
Before this patch, rcutorture reports 85 ns/read for my machine,
after the patch it reports 12.5 ns/read. On the other hand updates
go from 50 *micro*seconds to 20 *milli*seconds.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new header file provides heavy-weight "global" memory barriers that
enforce memory ordering on each running thread belonging to the current
process. For now, use a dummy implementation that issues memory barriers
on both sides (matching what QEMU has been doing so far).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Prepare for introducing smp_mb_placeholder() and smp_mb_global().
The new smp_mb() in synchronize_rcu() is not strictly necessary, since
the first atomic_mb_set for rcu_gp_ctr provides the required ordering.
However, synchronize_rcu is not performance critical, and it *will* be
necessary to introduce a smp_mb_global before calling wait_for_readers().
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A persistent build problem we see is where a source file
accidentally omits the #include of log.h. This slips through
local developer testing because if you configure with the
default (log) trace backend trace.h will pull in log.h for you.
Compilation fails only if some other backend is selected.
To make this error cause a compile failure regardless of
the configured trace backend, split out the parts of log.h
that trace.h requires into a new log-for-trace.h header.
Since almost all manual uses of the log.h functions will
use constants or functions which aren't in log-for-trace.h,
this will let us catch missing #include "qemu/log.h" more
consistently.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180213140029.8308-1-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Make audio_driver_lookup() try load the module in case it doesn't find
the driver in the registry. Also load all modules for -audio-help, so
the help output includes the help text for modular audio drivers.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180306074053.22856-3-kraxel@redhat.com
This allows, given a QemuOpts for a QemuOptsList that was merged from
multiple QemuOptsList, to only consider those options that exist in one
specific list. Block drivers need this to separate format-layer create
options from protocol-level options.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Current GCC has an optimization bug when compiling with ASAN.
See also GCC bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84307
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If a requested user interface is not available, try loading it as
module, simliar to block layer modules. Needed to keep things working
when followup patches start to build user interfaces as modules.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180301100547.18962-8-kraxel@redhat.com
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
The main culprit here is bswap.h which pulled in softfloat.h so it
could use the types in its CPU_Float* and ldfl/stfql functions. As
bswap.h is very widely included this added a compile dependency every
time we touch softfloat.h. Move the typedefs for each float type into
their own file so we don't re-build the world every time we tweak the
main softfloat.h header.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Currently only file backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.
Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.
Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.
There are no functional changes if the new flag is not used.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
It is possible for rate limited writes to keep overshooting a slice's
quota by a tiny amount causing the slice-aligned waiting period to
effectively halve the rate.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180207071758.6818-1-w.bumiller@proxmox.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter. Drop the include, and add it
to the places that actually need it.
While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.
This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
This cleanup makes the number of objects depending on qapi/qmp/qdict.h
drop from 4550 (out of 4743) to 368 in my "build everything" tree.
For qapi/qmp/qobject.h, the number drops from 4552 to 390.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-13-armbru@redhat.com>
This renders many inclusions of qapi/qmp/q*.h superfluous. They'll be
dropped in the next few commits.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-8-armbru@redhat.com>
This is a library to manage the host vfio interface, which could be used
to implement userspace device driver code in QEMU such as NVMe or net
controllers.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180116060901.17413-3-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
qemu_co_queue_next does not need to release and re-acquire the mutex,
because the queued coroutine does not run immediately. However, this
does not hold for qemu_co_enter_next. Now that qemu_co_queue_wait
can synchronize (via QemuLockable) with code that is not running in
coroutine context, it's important that code using qemu_co_enter_next
can easily use a standardized locking idiom.
First of all, qemu_co_enter_next must use aio_co_wake to restart the
coroutine. Second, the function gains a second argument, a QemuLockable*,
and the comments of qemu_co_queue_next and qemu_co_queue_restart_all
are adjusted to clarify the difference.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-5-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
There are cases in which a queued coroutine must be restarted from
non-coroutine context (with qemu_co_enter_next). In this cases,
qemu_co_enter_next also needs to be thread-safe, but it cannot use
a CoMutex and so cannot qemu_co_queue_wait. Use QemuLockable so
that the CoQueue can interchangeably use CoMutex or QemuMutex.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-4-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
QemuLockable is a polymorphic lock type that takes an object and
knows which function to use for locking and unlocking. The
implementation could use C11 _Generic, but since the support is
not very widespread I am instead using __builtin_choose_expr and
__builtin_types_compatible_p, which are already used by
include/qemu/atomic.h.
QemuLockable can be used to implement lock guards, or to pass around
a lock in such a way that a function can release it and re-acquire it.
The next patch will do this for CoQueue.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-3-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Linux commit 749df87bd7bee5a79cef073f5d032ddb2b211de8 (v4.14-rc1)
added a new flag MFD_HUGETLB to memfd_create() that specify the file
to be created resides in the hugetlbfs filesystem. This is the
generic hugetlbfs filesystem not associated with any specific mount
point.
hugetlbfs does not support sealing operations in v4.14, therefore
specifying MFD_ALLOW_SEALING with MFD_HUGETLB will result in EINVAL.
However, I added sealing support in "[PATCH v3 0/9] memfd: add sealing
to hugetlb-backed memory" series, queued in -mm tree for v4.16.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will allow callers to silence error report when the call is
allowed to failed.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It helps ASAN to detect more leaks on coroutine stacks, and to get rid
of some extra warnings.
Before:
tests/test-coroutine -p
/basic/lifecycle
/basic/lifecycle: ==20781==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
==20781==WARNING: ASan is ignoring requested __asan_handle_no_return:
stack top: 0x7ffcb184d000; bottom 0x7ff6c4cfd000; size: 0x0005ecb50000
(25446121472)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
OK
After:
tests/test-coroutine -p /basic/lifecycle
/basic/lifecycle: ==21110==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
OK
A similar work would need to be done for sigaltstack & windows fibers
to have similar coverage. Since ucontext is preferred, I didn't bother
checking the other coroutine implementations for now.
Update travis to fix the build with ASAN annotations.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We dropped support for ia64 host CPUs in the 2.11 release (removing
the TCG backend for it, and advertising the support as being
completely removed in the changelog). However there are a few bits
and pieces of code still floating about. Remove those, too.
We can drop the check in configure for "ia64 or hppa host?"
entirely, because we don't support hppa hosts either any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1516897189-11035-1-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit f87d72f5c5 as that is
part of a patchset reported to break cleanup and migration.
Cc: Gal Hammer <ghammer@redhat.com>
Cc: Sitong Liu <siliu@redhat.com>
Cc: Xiaoling Gao <xiagao@redhat.com>
Suggested-by: Greg Kurz <groug@kaod.org>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reported-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Add a function to only create a memfd, without mmap. The function is
used in the following memory backend.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20171023141815.17709-2-marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This function should be declared in generic header file so we can
utilize it.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Adding a cleanup callback function to the EventNotifier struct
which allows users to execute event_notifier_cleanup in a
different context.
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes leaks such as:
Direct leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x7eff58beb850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7eff57942f0c in g_malloc ../glib/gmem.c:94
#2 0x7eff579431cf in g_malloc_n ../glib/gmem.c:331
#3 0x7eff5795f6eb in g_strdup ../glib/gstrfuncs.c:363
#4 0x55db720f1d46 in readline_hist_add /home/elmarco/src/qq/util/readline.c:258
#5 0x55db720f2d34 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:387
#6 0x55db71539d00 in monitor_read /home/elmarco/src/qq/monitor.c:3896
#7 0x55db71f9be35 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167
#8 0x55db71f9bed3 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179
#9 0x55db71fa013c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
#10 0x55db71fe18a8 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
#11 0x7eff5793a90b in g_main_dispatch ../glib/gmain.c:3182
#12 0x7eff5793b7ac in g_main_context_dispatch ../glib/gmain.c:3847
#13 0x55db720af3bd in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214
#14 0x55db720af505 in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261
#15 0x55db720af6d6 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515
#16 0x55db7184e0de in main_loop /home/elmarco/src/qq/vl.c:1995
#17 0x55db7185e956 in main /home/elmarco/src/qq/vl.c:4914
#18 0x7eff4ea17039 in __libc_start_main (/lib64/libc.so.6+0x21039)
(while at it, use g_new0(ReadLineState), it's a bit easier to read)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180104160523.22995-11-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With no fixed array allocation, we can't overflow a buffer.
This will be important as optimizations related to host vectors
may expand the number of ops used.
Use QTAILQ to link the ops together.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This file begins tracking the files that will be the code base for HVF
support in QEMU. This code base is part of Google's QEMU version of
their Android emulator, and can be found at
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev
This code is based on Veertu Inc's vdhh (Veertu Desktop Hosted
Hypervisor), found at https://github.com/veertuinc/vdhh. Everything is
appropriately licensed under GPL v2-or-later, except for the code inside
x86_task.c and x86_task.h, which, deriving from KVM (the Linux kernel),
is licensed GPL v2-only.
This code base already implements a very great deal of functionality,
although Google's version removed from Vertuu's the support for APIC
page and hyperv-related stuff. According to the Android Emulator Release
Notes, Revision 26.1.3 (August 2017), "Hypervisor.framework is now
enabled by default on macOS for 32-bit x86 images to improve performance
and macOS compatibility", although we better use with caution for, as the
same Revision warns us, "If you experience issues with it specifically,
please file a bug report...". The code hasn't seen much update in the
last 5 months, so I think that we can further develop the code with
occasional visiting Google's repository to see if there has been any
update.
On top of Google's code, the following changes were made:
- add code to the configure script to support the --enable-hvf argument.
If the OS is Darwin, it checks for presence of HVF in the system. The
patch also adds strings related to HVF in the file qemu-options.hx.
QEMU will only support the modern syntax style '-M accel=hvf' no enable
hvf; the legacy '-enable-hvf' will not be supported.
- fix styling issues
- add glue code to cpus.c
- move HVFX86EmulatorState field to CPUX86State, changing the
the emulation functions to have a parameter with signature 'CPUX86State *'
instead of 'CPUState *' so we don't have to get the 'env'.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-2-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-3-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-5-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-6-Sergio.G.DelReal@gmail.com>
Message-Id: <20170905035457.3753-7-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When listening on unix/tcp sockets there was optional code that would update
the original SocketAddress struct with the info about the actual address that
was listened on. Since the conversion of everything to QIOChannelSocket, no
remaining caller made use of this feature. It has been replaced with the ability
to query the listen address after the fact using the function
qio_channel_socket_get_local_address. This is a better model when the input
address can result in listening on multiple distinct sockets.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171212111219.32601-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's going to be useful, in particular, in VMBus code massively using
uuids aka GUIDs.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20171127124355.26015-1-rkagan@virtuozzo.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Their last user went away in commit f51074cdc6, "pci-hotplug-old: Has
been dead for five major releases, bury", v2.3.0. Remove them, as new
code should use QemuOpts or maybe keyval_parse() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171006131645.17729-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The AioContext pointer argument to co_aio_sleep_ns() is only used for
the sleep timer. It does not affect where the caller coroutine is
resumed.
Due to changes to coroutine and AIO APIs it is now possible to drop the
AioContext pointer argument. This is safe to do since no caller has
specific requirements for which AioContext the timer must run in.
This patch drops the AioContext pointer argument and renames the
function to simplify the API.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171109102652.6360-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The function searches for next zero bit.
Also add interface for BdrvDirtyBitmap and unit test.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20171012135313.227864-2-vsementsov@virtuozzo.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
SPARC Linux has an oddity that it insists that mmap()
of MAP_FIXED memory must be at an alignment defined by
SHMLBA, which is more aligned than the page size
(typically, SHMLBA alignment is to 16K, and pages are 8K).
This is a relic of ancient hardware that had cache
aliasing constraints, but even on modern hardware the
kernel still insists on the alignment.
To ensure that we get mmap() alignment sufficient to
make the kernel happy, change QEMU_VMALLOC_ALIGN,
qemu_fd_getpagesize() and qemu_mempath_getpagesize()
to use the maximum of getpagesize() and SHMLBA.
In particular, this allows 'make check' to pass on Sparc:
we were previously failing the ivshmem tests.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512752248-17857-1-git-send-email-peter.maydell@linaro.org
In our various supported host OSes, the time_t type may be either 32
or 64 bit, and could in theory also be either signed or unsigned.
Notably, in OpenBSD time_t is a 64 bit type even if 'long' is 32
bits, so using LONG_MAX for TIME_MAX is incorrect.
Use an approach suggested by Paolo Bonzini which calculates
the maximum value of the type rather than hardcoding it;
to do this we use the TYPE_MAXIMUM macro from Gnulib.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1511452598-6077-1-git-send-email-peter.maydell@linaro.org
The previous patch fixed a race condition, in which there were
coroutines being executing doubly, or after coroutine deletion.
We can detect common scenarios when this happens, and print an error
message and abort before we corrupt memory / data, or segfault.
This patch will abort if an attempt to enter a coroutine is made while
it is currently pending execution, either in a specific AioContext bh,
or pending execution via a timer. It will also abort if a coroutine
is scheduled, before a prior scheduled run has occurred.
We cannot rely on the existing co->caller check for recursive re-entry
to catch this, as the coroutine may run and exit with
COROUTINE_TERMINATE before the scheduled coroutine executes.
(This is the scenario that was occurring and fixed in the previous
patch).
This patch also re-orders the Coroutine struct elements in an attempt to
optimize caching.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
We never noticed because it has no users.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1510273811-13419-1-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch adds support for dma-bufs to the qemu console interfaces.
It adds a new "struct QemuDmaBuf" to represent a dmabuf with accociated
metatdata (size, format). It adds three functions (and
DisplayChangeListenerOps operations) to set a dma-buf as display
scanout, as cursor and to release a dmabuf.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010135453.6704-2-kraxel@redhat.com
The header file was introduced by fbcc3e5 ("qemu-thread: optimize QemuLockCnt
with futexes on Linux", 2017-01-16) without header guards. Add them.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
These only depend on the host and therefore belong in the common
osdep, not in a target-dependent object.
While at it, query the host during an init constructor, which guarantees
the page size will be well-defined throughout the execution of the program.
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The only client of hbitmap_serialization_granularity() is dirty-bitmap's
bdrv_dirty_bitmap_serialization_align(). Keeping the two names consistent
is worthwhile, and the shorter name is more representative of what the
function returns (the required alignment to be used for start/count of
other serialization functions, where violating the alignment causes
assertion failures).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When using bit-wise operations that exploit the power-of-two
nature of the second argument of ROUND_UP(), we still need to
ensure that the mask is as wide as the first argument (done
by using a ternary to force proper arithmetic promotion).
Unpatched, ROUND_UP(2ULL*1024*1024*1024*1024, 512U) produces 0,
instead of the intended 2TiB, because negation of an unsigned
32-bit quantity followed by widening to 64-bits does not
sign-extend the mask.
Broken since its introduction in commit 292c8e50 (v1.5.0).
Callers that passed the same width type to both macro parameters,
or that had other code to ensure the first parameter's maximum
runtime value did not exceed the second parameter's width, are
unaffected, but I did not audit to see which (if any) existing
clients of the macro could trigger incorrect behavior (I found
the bug while adding a new use of the macro).
While preparing the patch, checkpatch complained about poor
spacing, so I also fixed that here and in the nearby DIV_ROUND_UP.
CC: qemu-trivial@nongnu.org
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In qemu-thread-posix.c we have two implementations of the
various qemu_sem_* functions, one of which uses native POSIX
sem_* and the other of which emulates them with pthread conditions.
This is necessary because not all our host OSes support
sem_timedwait().
Instead of a hard-coded list of OSes which don't implement
sem_timedwait(), which gets out of date, make configure
test for the presence of the function and set a new
CONFIG_HAVE_SEM_TIMEDWAIT appropriately.
In particular, newer NetBSDs have sem_timedwait(), so this
commit will switch them over to using it. OSX still does
not have an implementation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Provide helpers to convert bitmaps to little endian format. It can be
used when we want to send one bitmap via network to some other hosts.
One thing to mention is that, these helpers only solve the problem of
endianess, but it does not solve the problem of different word size on
machines (the bitmaps managing same count of bits may contains different
size when malloced). So we need to take care of the size alignment issue
on the callers for now.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Count how many bits set in the bitmap.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It's possible for address_space_get_flatview() as it currently stands
to cause a use-after-free for the returned FlatView, if the reference
count is incremented after the FlatView has been replaced by a writer:
thread 1 thread 2 RCU thread
-------------------------------------------------------------
rcu_read_lock
read as->current_map
set as->current_map
flatview_unref
'--> call_rcu
flatview_ref
[ref=1]
rcu_read_unlock
flatview_destroy
<badness>
Since FlatViews are not updated very often, we can just detect the
situation using a new atomic op atomic_fetch_inc_nonzero, similar to
Linux's atomic_inc_not_zero, which performs the refcount increment only if
it hasn't already hit zero. This is similar to Linux commit de09a9771a53
("CRED: Fix get_task_cred() and task_state() to not resurrect dead
credentials", 2010-07-29).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABCAAGBQJZwXs9AAoJECgHk2+YTcWmS18P/1OceEmetatwBQ6YWURA0VfU
CB+nCHxijlWXU8UiwoTVDOXc11P00V5VO0mMFouUbv+o+/80qyAfNl2DhDVq7ZXo
nhIFHKXUR1BX3YbcqfIonH8xCMGtrAexghhhaGPQbx450PqmGyyjBsZsXttNge1S
Yt+jzgD9drsZPYw4ZLJjT/tnWI/+8kDPn37jVujzRzApTD9/fq+77ZZq7q25RzQH
ISa5OXOQk6pq+tPHvaIFXfOqfhILcpM7u/X7MYVwiA1oBqcLXTDCjDX3cKavKade
+i3yrH7ahNgUO1DyT2bMZX6NAFoZDBrwlYgpkw8n+Yf+EUcdPDOHHEcEeaMpdTGx
wgWbQrs+xzIg/ulRb8Qqe9FwdXGQbehfFGofd+gnGQ0XuxekT82in+ucMOivQO3x
W/azGnzoz6D83stJFIZ93S69SRswqBuj2R8mu821yzqx1EUSNXgfKXz9OPwwFed5
El9YN127F/VAyp0av1CsOg1XgqIujMUGRxGf7eQBfkh1R3C/g2XNPTvR3yaY9L5B
zuMJfWLF6r6zL53ymt7/9UVEim295Lia3mNGS5/Min5QGms7edphsDsrubzXsZGq
2owWfAU/KeDH9gNVNNkZdLcEcS5TEz+2oGPR5oeDeB/QlzVdNQ3FeTVlzFavxNQa
8nrzeFcw7VNrIx2gdvsY
=LQ8U
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/ehabkost/tags/machine-next-pull-request' into staging
Machine/CPU/NUMA queue, 2017-09-19
# gpg: Signature made Tue 19 Sep 2017 21:17:01 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
MAINTAINERS: Update git URLs for my trees
hw/acpi-build: Fix SRAT memory building in case of node 0 without RAM
NUMA: Replace MAX_NODES with nb_numa_nodes in for loop
numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed
arm: drop intermediate cpu_model -> cpu type parsing and use cpu type directly
pc: use generic cpu_model parsing
vl.c: convert cpu_model to cpu type and set of global properties before machine_init()
cpu: make cpu_generic_init() abort QEMU on error
qom: cpus: split cpu_generic_init() on feature parsing and cpu creation parts
hostmem-file: Add "discard-data" option
osdep: Define QEMU_MADV_REMOVE
vl: Clean up user-creatable objects when exiting
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Define QEMU_MADV_REMOVE, so we can use it with qemu_madvise().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170824192315.5897-3-ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Zack Cornelius <zack.cornelius@kove.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Report amount of hotplugged memory in addition to total
amount per NUMA node.
Signed-off-by: Vadim Galitsyn <vadim.galitsyn@profitbricks.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: qemu-devel@nongnu.org
Message-Id: <20170829153022.27004-2-vadim.galitsyn@profitbricks.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZsEDbAAoJEAUWMx68W/3nxcwQAKNhcpoAKbcwQn5s9aOcR1Nd
9pTV/5ucfPI9rMo0cC1JKgVoJVhx1rXwyVuohC/+jZBLHjgTrN8yivo4WLiCEyFx
0DH4v1Lp/cnvhpNOUisda17lD9Josroi0GF13bbjvyKnyCnOOTEDwn/3YJZdwH2u
AkWVS+WwRPPqmqHfvjUiEfDyO8SXpk635eoHqLVrS1I5+uykPKxSqJAwZmEGnf5Y
y+k+VnIq1pVRt/6C/mw4tm1ZVjzeKm77AGMlnMIAvT9/n5TfolMlpSha9RG86avf
4QgJIA5duaWNj8Kt7LsnGmBUKndNgY2eFDV3IJA9ke5x9FkrO3S02W4LT1/4N4MU
P/art2z+jtKWVzxvdmz9S1fEKnmE44ULCbLJRVucoK5lYEgDdAfapO6Duq/hf+JW
jWeaiNlF/oJrMZH89HdJt25iGYBcYuwg1vc6QlGMQ0zHzFYw2xV2z3V3raHUUQD8
74iJeSM7O2oVTzgfeOBNpKVpVqQLPEY2B9vK8slZtUNsiziQjaWufuzYfn7Mi7kU
JZ29vfLx+3mK6Ateq/diQgHuJsWxM2LDe2DlhwBgmSjEFzWK/q8Eg/AzX+O4xDvv
Kd5KVpc+N7nFwiWRScplibPgFmzSHzAack8Kwi5s1JPj63kJcuR56hWQaJ9d4uA7
k5SN9mduPLVP2zkDGH53
=lrj8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgilbert/tags/pull-migration-20170906a' into staging
migration pull 2017-09-06
# gpg: Signature made Wed 06 Sep 2017 19:39:23 BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170906a:
migration: dump str in migrate_set_state trace
snapshot/tests: Try loadvm twice
migration: Reset rather than destroy main_thread_load_event
runstate/migrate: Two more transitions
host-utils: Simplify pow2ceil()
host-utils: Proactively fix pow2floor(), switch to unsigned
xbzrle: Drop unused cache_resize()
migration: Report when bdrv_inactivate_all fails
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1501148776-16890-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The function's stated contract is simple enough: "round down to the
nearest power of 2". Suggests the domain is the representable numbers
>= 1, because that's the smallest power of two.
The implementation doesn't check for domain errors, but returns
garbage instead:
* For negative arguments, pow2floor() returns -2^63, which is not even
a power of two, let alone the nearest one.
What sort of works is passing *unsigned* arguments >= 2^63. The
implicit conversion to signed is implementation defined, but
commonly yields the (negative) two's complement. pow2floor() then
returns -2^63. Callers that convert that back to unsigned get the
correct value 2^63.
* For a zero argument, pow2floor() shifts right by 64. Undefined
behavior. Common actual behavior is to shift by 0, yielding -2^63.
Fix by switching from int64_t to uint64_t and amending the contract to
map zero to zero.
Callers are fine with that:
* memory_access_size()
This function makes no sense unless the argument is positive and the
return value fits into int.
* raw_refresh_limits()
Passes an int between 1 and BDRV_REQUEST_MAX_BYTES.
* iscsi_refresh_limits()
Passes an integer between 0 and INT_MAX, converts the result to
uint32_t. Passing zero would be undefined behavior, but commonly
yield zero. The patch gives us the zero without the undefined
behavior.
* cache_init()
Passes a positive int64_t argument.
* xbzrle_cache_resize()
Passes a positive int64_t argument (>= TARGET_PAGE_SIZE, actually).
* spapr_node0_size()
Passes a positive uint64_t argument, and converts the result to
hwaddr, i.e. uint64_t.
* spapr_populate_memory()
Passes a positive hwaddr argument, and converts the result to
hwaddr.
Cc: Juan Quintela <quintela@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1501148776-16890-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
block/throttle.c uses existing I/O throttle infrastructure inside a
block filter driver. I/O operations are intercepted in the filter's
read/write coroutines, and referred to block/throttle-groups.c
The driver can be used with the syntax
-drive driver=throttle,file.filename=foo.qcow2,throttle-group=bar
which registers the throttle filter node with the ThrottleGroup 'bar'. The
given group must be created beforehand with object-add or -object.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We were using -1 instead of the real size because the functions check
what is bigger, size in bytes or the size of the iov. Recent gcc's
barf at this.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
--
Remove comments about this feature.
Fix missing -1.
ThrottleGroup is converted to an object. This will allow the future
throttle block filter drive easy creation and configuration of throttle
groups in QMP and cli.
A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.
ThrottleGroups can be created via CLI as
-object throttle-group,id=foo,x-iops-total=100,x-..
where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.
ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct. It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:
{ "execute": "object-add",
"arguments": {
"qom-type": "throttle-group",
"id": "foo",
"props" : {
"limits": {
"iops-total": 100
}
}
}
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}
This also means a group's configuration can be fetched with qom-get.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The non-blocking connect mechanism is obsolete, and it doesn't
work well in inet connection, because it will call getaddrinfo
first and getaddrinfo will blocks on DNS lookups. Since commit
e65c67e4 & d984464e, the non-blocking connect of migration goes
through QIOChannel in a different manner(using a thread), and
nobody use this old non-blocking connect anymore.
Any newly written code which needs a non-blocking connect should
use the QIOChannel code, so we can drop NonBlockingConnectHandler
as a concept entirely.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
LeakyBucket.burst_length is defined as an unsigned integer but the
code never checks for overflows and it only makes sure that the value
is not 0.
In practice this means that the user can set something like
throttling.iops-total-max-length=4294967300 despite being larger than
UINT_MAX and the final value after casting to unsigned int will be 4.
This patch changes the data type to uint64_t. This does not increase
the storage size of LeakyBucket, and allows us to assign the value
directly from qemu_opt_get_number() or BlockIOThrottle and then do the
checks directly in throttle_is_valid().
The value of burst_length does not have a specific upper limit,
but since the bucket size is defined by max * burst_length we have
to prevent overflows. Instead of going for UINT64_MAX or something
similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts
of 1 GiB/s for 10 days in a row.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Both the throttling limits set with the throttling.iops-* and
throttling.bps-* options and their QMP equivalents defined in the
BlockIOThrottle struct are integer values.
Those limits are also reported in the BlockDeviceInfo struct and they
are integers there as well.
Therefore there's no reason to store them internally as double and do
the conversion everytime we're setting or querying them, so this patch
uses uint64_t for those types. Let's also use an unsigned type because
we don't allow negative values anyway.
LeakyBucket.level and LeakyBucket.burst_level do however remain double
because their value changes depending on the fraction of time elapsed
since the previous I/O operation.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The level of the burst bucket is stored in bkt.burst_level, not
bkt.burst_length.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 49aab2711d02f285567f3b3b13a113847af33812.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Build time check of OFD lock is not sufficient and can cause image open
errors when the runtime environment doesn't support it.
Add a helper function to probe it at runtime, additionally. Also provide
a qemu_has_ofd_lock() for callers to check the status.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts commit a59629fcc6.
This is not needed anymore because the IOThread mutex is not
"magic" anymore (need not kick the CPU thread)and also because
fork callbacks are only enabled at the very beginning of
QEMU's execution.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because of -daemonize, system mode QEMU sometimes needs to fork() and
keep RCU enabled in the child. However, there is a possible deadlock
with synchronize_rcu:
- the CPU thread is inside a RCU critical section and wants to take
the BQL in order to do MMIO
- the monitor thread, which is owning the BQL, calls rcu_init_lock
which tries to take the rcu_sync_lock
- the call_rcu thread has taken rcu_sync_lock in synchronize_rcu, but
synchronize_rcu needs the CPU thread to end the critical section
before returning.
This cannot happen for user-mode emulation, because it does not have
a BQL.
To fix it, assume that system mode QEMU only forks in preparation for
exec (except when daemonizing) and disable pthread_atfork as soon as
the double fork has happened.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With the move of some docs/ to docs/devel/ on ac06724a71,
a couple of references were not updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZddzIAAoJEDhwtADrkYZTNWcQALGul4VFe3NobBGAyv0tOT13
1a1dtILD99IwC2aR5/ADqZyejMO7PiCNav7aQmQn3G7tFEleWMBZCj/zG0O756DF
UDxBLF9xpEu5i1uV9PAsW8K/FfZcsibAkDFCHpPJImOUtbijqJnhM2L++m+ctPvM
GlI0VRD8MA34qjmcra2QtapHSHPfi4SF33dcxiUxc3bTSKAJjarE4mhdR7Z8pj74
r1iu0w/2FvNs94LPIp1Ma7WtwRw4K1IypPf6815aaNNB49ahOWi/bHk9zNz4wYKG
Kt63zAAie3WxzaPUH238ZXh7AtGpIYTKYNCAF93URqf28ZrfkFrlaYX08hRvWeSr
U17KA/1ud/9xdNMW2uKs9ANE+vuWpZ543in4xO9FC0Jmdbq7PBp4TzHNRq5qTvNm
zAc2Q2d7dUhd2uYqkzYdgo9MR9qx9TY66RqPw050Il6oLxQoOjFp02Fgg9Z2WF9Q
pOOPr6uuvLw8MoesEK0IQVtUxCWK2113VTSUgdw5Mr0KEaWMIVDigV47kkItWjw/
2mcwI+d/ZEEgagSaYhmw1qGUEVrZQ2qWQNLwcGpjfSHSAkG2V7lv3JEUmdQLFOLA
VXr2dtVHXRT/ZsxdoTze8O8At4tI1SDAIFI00H8oXi8Nbwtp3w0C40JApFNQHnku
bSJo536xTMeQbCiPbrst
=DCKd
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-07-18-v2' into staging
QAPI patches for 2017-07-18
# gpg: Signature made Mon 24 Jul 2017 12:40:56 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-07-18-v2:
migration: Use JSON null instead of "" to reset parameter to default
migration: Unshare MigrationParameters struct for now
migration: Add TODO comments on duplication of QAPI_CLONE()
migration: Clean up around tls_creds, tls_hostname
hmp: Clean up and simplify hmp_migrate_set_parameter()
block: Use JSON null instead of "" to disable backing file
tests/test-qobject-input-visitor: Drop redundant test
qapi: Introduce a first class 'null' type
qapi: Use QNull for a more regular visit_type_null()
qapi: Separate type QNull from QObject
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the
supplied <cpuid.h> does not contain the bit_AVX2 define that we use
when detecting whether the routine can be enabled.
Introduce a qemu-specific header that uses the compiler's definition
of __cpuid et al, but supplies any missing bit_* definitions needed.
This avoids introducing any extra ifdefs to util/bufferiszero.c, and
allows quite a few to be removed from tcg/i386/tcg-target.inc.c.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170719044018.18063-1-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I expect the 'null' type to be useful mostly for members of alternate
types.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJZbg1XAAoJEH8JsnLIjy/Wdj8QAJm5SE8cOwonEDRd9AV5n0Eg
xoHLFEEKjqBJ8oDHHn7huVbNdHN693vM2ro2Exxx8ZCTSdIkvSeVmEOrzb76sWOe
QRbCTWUKbMD6wjCNF5tqPsvmk+ZkHMqYhyVcRAaIpd+IcEECA16ot/fhRa6Ec/bk
8GHzDSxkVq5wFgoEJ09hGEE7GY2uGdV1HEJK7xq+Vittx8LV3QMnlH4KvZ9VzYfe
BnNsmK5vMNlsHTfWfQXsB+sxb+aGGr5v45e4XfctTxGx08ajMC50WnYZUezySERJ
TXimHOiJNHMpapfU7focLuapwMm6AxpQAh5QzxTBgaqW7eeX3P16DWx4m/WfRL7v
AuyM4U3TdH0vYZPGlQ5pAlScmeZh+GRBRiDkJf/04q7hH2Hgt85+8gyef7FF/Qta
KT49tBr64eA89ZUDVFBCkukyYWKWTDSNrGJjB6gMqh7cI6gI55uLdXB/nF4vCgJu
YfYTdaF/1GJm22HtAg3O5fctRDh14rkBgi5jPhifaT7pP0zZm0JBxGlpXUWkg3RA
NIhZ2fJ2/FasS7/5IsUjJbYuI52CTLmNXQIRt/ZHekzQkgk1VPrnJls0ibCdG8NF
4z90uIG7bUuEIjZWKogB+gyH9MMtG3qlfZ0RjmXq2FfWCNhBmfezcGOx+Jnf+XDb
IMPzzwu77XVcduj4XxKL
=2c6k
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
# gpg: Signature made Tue 18 Jul 2017 14:29:59 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (21 commits)
qemu-img: Check for backing image if specified during create
blockdev: move BDRV_O_NO_BACKING option forward
block/vvfat: Fix compiler warning with gcc 7
vvfat: initialize memory after allocating it
vvfat: correctly parse non-ASCII short and long file names
vvfat: add a constant for bootsector name
vvfat: add constants for special values of name[0]
qemu-iotests: Test unplug of -device without drive
qemu-iotests: Test 'info block'
scsi-disk: bdrv_attach_dev() for empty CD-ROM
ide: bdrv_attach_dev() for empty CD-ROM
block: List anonymous device BBs in query-block
block/qapi: Use blk_all_next() for query-block
block: Make blk_all_next() public
block/qapi: Add qdev device name to query-block
block: Make blk_get_attached_dev_id() public
block/vpc.c: Handle write failures in get_image_offset()
block/vmdk: Report failures in vmdk_read_cid()
block: remove timer canceling in throttle_config()
block: add clock_type field to ThrottleGroup
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
throttle_config() cancels the timers of the calling BlockBackend. This
doesn't make sense because other BlockBackends in the group remain
untouched. There's no need to cancel the timers in the one specific
BlockBackend so let's not do that. Throttled requests will run as
scheduled and future requests will follow the new configuration. This
also allows a throttle group's configuration to be changed even when it
has no members.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Clock type in throttling is currently inferred by the ThrottleTimer's
clock type even though it is a per-ThrottleGroup property; it doesn't
make sense to have different clock types in the same group. Moving this
to a field in ThrottleGroup can simplify some of the throttle functions.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
By exposing FWCfgIoState and FWCfgMemState internals we allow the possibility
for the internal MemoryRegion fields to be mapped by name for boards that wish
to wire up the fw_cfg device themselves.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-Id: <1500025208-14827-4-git-send-email-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>