Add a mutex to protect the SIGBUS case, as we cannot mess concurrently
with the sigbus handler and we have to manage the global variable
sigbus_memset_context. The MADV_POPULATE_WRITE path can run
concurrently.
Note that page_mutex and page_cond are shared between concurrent
invocations, which shouldn't be a problem.
This is a preparation for future virtio-mem prealloc code, which will call
os_mem_prealloc() asynchronously from an iothread when handling guest
requests.
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's simplify the case when we only want a single thread and don't have
to mess with signal handlers.
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's limit the number of threads to something sane, especially so that:
- We don't have more threads than the number of pages we have
- We don't have threads that initialize small (< 64 MiB) memory
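The two limits above can be sketched as follows. This is only an
illustration under assumptions: toy_prealloc_threads() and TOY_MIB are
hypothetical names, the host CPU count is left out, and overflow of
pagesize * numpages is ignored.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MIB ((uint64_t)1 << 20)

/* Hypothetical helper for illustration; not the function changed here. */
static int toy_prealloc_threads(uint64_t pagesize, uint64_t numpages,
                                int max_threads)
{
    uint64_t n = max_threads > 0 ? (uint64_t)max_threads : 1;
    uint64_t by_size = pagesize * numpages / (64 * TOY_MIB);

    /* Never use more threads than there are pages. */
    if (n > numpages) {
        n = numpages;
    }
    /* Allow at most one thread per 64 MiB of memory to initialize. */
    if (by_size < 1) {
        by_size = 1;
    }
    if (n > by_size) {
        n = by_size;
    }
    /* Always keep at least one thread, even for numpages == 0. */
    return n > 0 ? (int)n : 1;
}
```

With 4 KiB pages, 1024 pages (4 MiB in total) end up on a single thread,
while two 1 GiB pages get at most two threads.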
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's minimize the number of global variables to prepare for
os_mem_prealloc() getting called concurrently and make the code a bit
easier to read.
The only consumer that really needs a global variable is the sigbus
handler, which will require protection via a mutex in the future either way
as we cannot concurrently mess with the SIGBUS handler.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.
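A minimal sketch of the idea, assuming a Linux host: try
MADV_POPULATE_WRITE first and fall back to touching pages by hand when
the kernel reports EINVAL (advice not supported). toy_prealloc() is a
hypothetical name, and the fallback is simplified (no SIGBUS handling,
no threads).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MADV_POPULATE_WRITE
#define MADV_POPULATE_WRITE 23 /* value from the Linux UAPI headers */
#endif

/* Hypothetical helper: returns 0 on success or a negative errno. */
static int toy_prealloc(void *area, size_t size)
{
    volatile char *p = area;
    size_t pagesize = (size_t)sysconf(_SC_PAGESIZE);
    size_t off;

    if (madvise(area, size, MADV_POPULATE_WRITE) == 0) {
        return 0; /* kernel prefaulted everything for us */
    }
    if (errno != EINVAL) {
        return -errno; /* real failure, e.g. out of memory */
    }
    /* Advice unknown to this kernel: write-touch one byte per page. */
    for (off = 0; off < size; off += pagesize) {
        p[off] = p[off];
    }
    return 0;
}
```

The fallback rewrites each byte with its own value, so existing page
content is preserved, but unlike MADV_POPULATE_WRITE it does touch it.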
While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.
More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].
This resolves the TODO in do_touch_pages().
In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.
[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com
Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's prepare touch_all_pages() for returning differing errors. Return
an error from the thread and report the last processed error.
Translate SIGBUS to -EFAULT, as a SIGBUS can mean all different kinds of
things (memory error, read error, out of memory). When allocating memory
fails via the current SIGBUS-based mechanism, we'll get:
os_mem_prealloc: preallocating memory failed: Bad address
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Invoke the transaction drivers' .clean() methods only after all
.commit() or .abort() handlers are done.
This makes it easier to have nested transactions where the top-level
transactions pass objects to lower transactions that the latter can
still use throughout their commit/abort phases, while the top-level
transaction keeps a reference that is released in its .clean() method.
(Before this commit, that is also possible, but the top-level
transaction would need to take care to invoke tran_add() before the
lower-level transaction does. This commit makes the ordering
irrelevant, which is just a bit nicer.)
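The resulting ordering can be sketched with a toy transaction. All
names below (ToyTran, toy_tran_add(), toy_tran_commit()) are
hypothetical, and running the newest action first is an assumption of
this sketch, not a statement about the real implementation.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_ACTIONS 8

typedef struct ToyTran {
    const char *names[TOY_MAX_ACTIONS];
    int n;
} ToyTran;

static char toy_log[128];

/* Record which phase ran for which action. */
static void toy_note(const char *phase, const char *name)
{
    size_t used = strlen(toy_log);

    snprintf(toy_log + used, sizeof(toy_log) - used, "%s(%s) ", phase, name);
}

static void toy_tran_add(ToyTran *t, const char *name)
{
    assert(t->n < TOY_MAX_ACTIONS);
    t->names[t->n++] = name;
}

static void toy_tran_commit(ToyTran *t)
{
    int i;

    /* Phase 1: run every commit handler. */
    for (i = t->n - 1; i >= 0; i--) {
        toy_note("commit", t->names[i]);
    }
    /* Phase 2: only now run the clean handlers. */
    for (i = t->n - 1; i >= 0; i--) {
        toy_note("clean", t->names[i]);
    }
    t->n = 0;
}
```

Whatever order the actions were added in, no clean handler runs before
the last commit handler has finished.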
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-8-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-8-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
The drain_rcu_call() function can be blocked as long as an RCU reader
stays in a read-side critical section. This is typically what happens
when a TCG vCPU is executing a busy loop. It can deadlock the QEMU
monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 .
This can be avoided by allowing drain_rcu_call() to enforce an RCU grace
period. Since each reader might need to do specific actions to end a
read-side critical section, do it with notifiers.
Prepare ground for this by adding a notifier list to the RCU reader
struct and use it in wait_for_readers() if drain_rcu_call() is in
progress. An API is added for readers to register their notifiers.
This is largely based on a draft from Paolo Bonzini.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211109183523.47726-2-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As per the QEMU guidelines:
Unless a pointer is used to modify the pointed-to storage, give it the
"const" attribute.
In the particular case of iova_tree_find, this allows enforcing what is
requested by its comment, since the compiler will complain about
modifying or freeing the const-qualified returned pointer.
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211013182713.888753-2-eperezma@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These will be used to implement new decimal floating point
instructions from Power ISA 3.1.
The remainder is now returned directly by divu128/divs128,
freeing up phigh to receive the high 64 bits of the quotient.
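A sketch of the calling convention described above, assuming a compiler
with the unsigned __int128 extension (GCC, Clang). toy_divu128() is a
hypothetical stand-in and says nothing about how the real helper is
implemented; the caller is expected to have ruled out a zero divisor.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Divide the 128-bit value *phigh:*plow by divisor.
 * On return *phigh:*plow holds the 128-bit quotient and the
 * remainder is the return value.
 */
static uint64_t toy_divu128(uint64_t *plow, uint64_t *phigh, uint64_t divisor)
{
    unsigned __int128 dividend = ((unsigned __int128)*phigh << 64) | *plow;
    unsigned __int128 quotient;

    assert(divisor != 0); /* checked by the caller in this sketch */
    quotient = dividend / divisor;
    *plow = (uint64_t)quotient;
    *phigh = (uint64_t)(quotient >> 64);
    return (uint64_t)(dividend % divisor);
}
```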
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-4-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
In preparation for changing the divu128/divs128 implementations
to allow for quotients larger than 64 bits, move the div-by-zero
and overflow checks to the callers.
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-2-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use QTAILQ_FOREACH_SAFE() so that the current QemuOpts can be deleted
while iterating through the whole list.
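The pattern behind a "safe" iteration can be sketched on a plain
singly linked list: the next pointer is saved before the current
element may be freed. ToyNode and the helpers are hypothetical; the
sketch does not use the QTAILQ macros themselves.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyNode {
    int value;
    struct ToyNode *next;
} ToyNode;

/* Prepend a new node and return the new head. */
static ToyNode *toy_push(ToyNode *head, int value)
{
    ToyNode *node = malloc(sizeof(*node));

    assert(node);
    node->value = value;
    node->next = head;
    return node;
}

/* Free every node with an odd value while walking the list. */
static ToyNode *toy_drop_odd(ToyNode *head)
{
    ToyNode **link = &head;
    ToyNode *node, *next;

    for (node = head; node; node = next) {
        next = node->next; /* saved before node may be freed */
        if (node->value % 2) {
            *link = next;
            free(node);
        } else {
            link = &node->next;
        }
    }
    return head;
}

static int toy_sum(const ToyNode *head)
{
    int sum = 0;

    for (; head; head = head->next) {
        sum += head->value;
    }
    return sum;
}
```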
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211008133442.141332-11-kwolf@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This makes the pthreads check dead in configure, so remove it
as well.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20211007130829.632254-9-pbonzini@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows the use of native signalfd instead of the sigtimedwait
based emulation on systems other than Linux.
Signed-off-by: Kacper Słomiński <kacper.slominski72@gmail.com>
Message-Id: <20210905011621.200785-1-kacper.slominski72@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The QAPI schema shouldn't rely on C system headers #define, but on
configure-time project #define, so we can express the build condition in
a C-independent way.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210907121943.3498701-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The previous code didn't detect overflows if the high 64 bits
of the dividend were equal to the 64-bit divisor. In that case,
64 bits wouldn't be enough to hold the quotient.
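The condition can be written down as a small predicate. The dividend
is high * 2^64 + low with low < 2^64, so the quotient stays below 2^64
exactly when high < divisor; a check must therefore treat the "equal"
case as an overflow too. toy_div_overflows() is a hypothetical name
used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * True if (high * 2^64 + low) / divisor does not fit in 64 bits.
 * A zero divisor is reported as overflow as well.
 */
static bool toy_div_overflows(uint64_t high, uint64_t divisor)
{
    return high >= divisor;
}
```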
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210910112624.72748-2-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Simple unions predate flat unions. Having both complicates the QAPI
schema language and the QAPI generator. We haven't been using simple
unions in new code for a long time, because they are less flexible and
somewhat awkward on the wire.
To prepare for their removal, convert simple union SocketAddressLegacy
to an equivalent flat one, with existing enum SocketAddressType
replacing implicit enum type SocketAddressLegacyKind. Adds some
boilerplate to the schema, which is a bit ugly, but a lot easier to
maintain than the simple union feature.
Cc: "Daniel P. Berrangé" <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210917143134.412106-9-armbru@redhat.com>
As we can see from the following function call stack, amaster and aslave
cannot be NULL: char_pty_open() -> qemu_openpty_raw() -> openpty().
In addition, according to the API specification for openpty():
https://www.gnu.org/software/libc/manual/html_node/Pseudo_002dTerminal-Pairs.html,
the arguments name, termp and winp can all be NULL, but the arguments
amaster and aslave cannot be NULL.
Finally, amaster and aslave have already been dereferenced at the
beginning of openpty().
So the checks on amaster and aslave in openpty() are redundant. Remove them.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <5F9FE5B8.1030803@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This seems to be either a glibc or gcc bug, but the code
appears to be fine with the warning suppressed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210803211907.150525-1-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pass qemu_vfio_do_mapping() an Error* argument so it can propagate
any error to callers. Replace error_report(), which only reports
to the monitor, with the more generic error_setg_errno().
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-11-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Both qemu_vfio_find_fixed_iova() and qemu_vfio_find_temp_iova()
return an errno which is unused (or overwritten). Have them propagate
any errors to callers, returning a boolean (which is what the
Error API recommends, see commit e3fe3988d7 "error: Document Error
API usage rules" for rationale).
Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-9-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Extract qemu_vfio_water_mark_reached() for readability,
and have it provide an error hint in its Error* handle.
Suggested-by: Klaus Jensen <k.jensen@samsung.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-8-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently qemu_vfio_dma_map() displays errors on stderr.
When using a management interface, this information is simply
lost. Pass qemu_vfio_dma_map() an Error** handle so it can
propagate the error to callers.
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-7-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu_vfio_add_mapping() returns a pointer to an indexed entry
in pre-allocated QEMUVFIOState::mappings[], thus can not be NULL.
Remove the pointless check.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-5-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Simplify qemu_vfio_dma_[un]map() handlers by replacing a pair of
qemu_mutex_lock/qemu_mutex_unlock calls by the WITH_QEMU_LOCK_GUARD
macro.
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-4-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Instead of displaying the error on stderr, use error_report()
which also reports to the monitor.
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-3-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit 4cfd970ec1 added an
assert which ensures the path within an address of a unix
socket returned from the kernel is at least one byte and
does not exceed the sun_path buffer. Both of these constraints
are wrong:
A unix socket can be unnamed; in this case the path is
completely empty (not even \0).
And some implementations (notably Linux) can add an extra
trailing byte (\0) _after_ the sun_path buffer if we
passed a buffer larger than it (and we do).
So remove the assertion (since it causes real-life breakage)
but at the same time fix the usage of sun_path. Namely,
we should not access sun_path[0] if the kernel did not return
it at all (this is the case for unnamed sockets),
and we should use the returned salen when copying the actual
path as an upper bound for the number of bytes to copy. This
will ensure we won't exceed the information provided by
the kernel, regardless of whether there is a trailing \0
or not. This also helps with unnamed sockets.
Note that in the case of an abstract socket, sun_path is
actually a blob and can contain \0 characters, so it should
not be passed to g_strndup and the like; it should be accessed
by memcpy-like functions.
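The three cases (unnamed, pathname, abstract) can be sketched as below.
toy_unix_path() is a hypothetical helper written for this illustration
only; it shows salen being used as the upper bound and is not the code
changed by this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/*
 * Copy the path bytes described by salen into out, always adding a
 * terminating \0. Returns the number of path bytes copied and sets
 * *abstract for abstract sockets.
 */
static size_t toy_unix_path(const struct sockaddr_un *su, socklen_t salen,
                            char *out, size_t outsz, int *abstract)
{
    const size_t hdr = offsetof(struct sockaddr_un, sun_path);
    size_t len = salen > hdr ? salen - hdr : 0;
    const char *src = su->sun_path;
    const char *nul;

    assert(outsz > 0);
    *abstract = 0;
    if (len > sizeof(su->sun_path)) {
        len = sizeof(su->sun_path); /* ignore bytes beyond the buffer */
    }
    if (len > 0 && src[0] == '\0') {
        *abstract = 1;              /* a blob follows the leading \0 */
        src++;
        len--;
    } else if (len > 0) {
        nul = memchr(src, '\0', len); /* a trailing \0 is optional */
        if (nul) {
            len = (size_t)(nul - src);
        }
    }
    if (len > outsz - 1) {
        len = outsz - 1;
    }
    memcpy(out, src, len);
    out[len] = '\0';
    return len;
}
```

For an unnamed socket len is 0, so sun_path[0] is never read.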
Fixes: 4cfd970ec1
Fixes: http://bugs.debian.org/993145
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
CC: qemu-stable@nongnu.org
Commit 776b97d360 "qemu-sockets: add abstract UNIX domain socket
support" neglected to update socket_sockaddr_to_address_unix() and
copied the whole sun_path without taking "salen" into account.
Later, commit 3b14b4ec49 "sockets: Fix socket_sockaddr_to_address_unix()
for abstract sockets" handled the abstract UNIX path, by stripping the
leading \0 character and fixing address details, but didn't use salen
either.
Not taking "salen" into account may result in an incorrect "path" being
returned in monitor commands, as we read past the address, which is not
necessarily \0-terminated.
Fixes: 776b97d360
Fixes: 3b14b4ec49
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
From clang-13:
util/selfmap.c:26:21: error: variable 'errors' set but not used \
[-Werror,-Wunused-but-set-variable]
Quite right of course, but there's no reason not to check errors.
First, incrementing errors is incorrect, because qemu_strtoul
returns an errno, not a count -- just OR them together so that
we have a non-zero value at the end.
Second, if we have an error, do not add the struct to the list,
but free it instead.
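Both points can be sketched with a toy parser. toy_strtoul() is a
hypothetical stand-in for a wrapper that returns 0 or a negative
errno, and toy_parse_range() only illustrates OR-ing the results and
freeing the struct on error; input validation is deliberately minimal.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

typedef struct ToyRange {
    unsigned long start, end;
} ToyRange;

/* Hypothetical wrapper: returns 0 on success or a negative errno. */
static int toy_strtoul(const char *s, const char **end, unsigned long *out)
{
    char *ep;

    errno = 0;
    *out = strtoul(s, &ep, 16);
    *end = ep;
    if (ep == s) {
        return -EINVAL; /* no digits at all */
    }
    return errno ? -errno : 0;
}

/* Parse "start-end" in hex; returns NULL on any parse error. */
static ToyRange *toy_parse_range(const char *s)
{
    ToyRange *r = malloc(sizeof(*r));
    const char *p;
    int errors;

    if (!r) {
        return NULL;
    }
    /* OR the results: any failure leaves a non-zero value. */
    errors = toy_strtoul(s, &p, &r->start);
    errors |= (*p == '-') ? toy_strtoul(p + 1, &p, &r->end) : -EINVAL;
    if (errors) {
        free(r); /* do not hand out a half-parsed entry */
        return NULL;
    }
    return r;
}
```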
Cc: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Commit d8fb7d0969 ("vl: switch -M parsing
to keyval") stopped adding the "machine" QemuOptsList. This causes
"machine" options to not show up in QMP query-command-line-options
output. For example, libvirt cannot detect that kernel_irqchip support
is available.
Adjust the "machine" opts enumeration in
qmp_query_command_line_options() so that options are properly reported.
Fixes: d8fb7d0969 ("vl: switch -M parsing to keyval")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210721151055.424580-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use it to avoid some clang-12 -Watomic-alignment errors,
forcing some structures to be aligned and as a pointer when
we have ensured that the address is aligned.
Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The `aio-max-batch` parameter will be propagated to AIO engines
and it will be used to control the maximum number of queued requests.
When the number of queued requests reaches `aio-max-batch`,
the engine invokes the system call to forward the requests to the kernel.
This parameter allows us to control the maximum batch size to reduce
the latency that requests might accumulate while queued in the AIO
engine queue.
If `aio-max-batch` is equal to 0 (default value), the AIO engine will
use its default maximum batch size value.
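The behaviour described above can be sketched with a toy queue.
ToyQueue, TOY_DEFAULT_MAX_BATCH (32 here) and the helpers are
hypothetical; real engines choose their own default and submit through
their own system calls.

```c
#include <assert.h>

#define TOY_DEFAULT_MAX_BATCH 32 /* hypothetical engine default */

typedef struct ToyQueue {
    unsigned queued;    /* requests waiting to be submitted */
    unsigned max_batch; /* effective batch limit */
    unsigned submits;   /* number of simulated system calls */
} ToyQueue;

static void toy_queue_init(ToyQueue *q, unsigned aio_max_batch)
{
    q->queued = 0;
    q->submits = 0;
    /* 0 means "use the engine's default maximum batch size". */
    q->max_batch = aio_max_batch ? aio_max_batch : TOY_DEFAULT_MAX_BATCH;
}

/* Stand-in for the system call that forwards requests to the kernel. */
static void toy_submit(ToyQueue *q)
{
    if (q->queued) {
        q->submits++;
        q->queued = 0;
    }
}

static void toy_enqueue(ToyQueue *q)
{
    q->queued++;
    if (q->queued >= q->max_batch) {
        toy_submit(q); /* batch is full: do not let latency build up */
    }
}
```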
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-id: 20210721094211.69853-3-sgarzare@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The leak is basically impossible to reach, since the only common way
to get ferror(fp) is by passing a directory to -readconfig. In that
case, the error occurs before qdict is set to anything non-NULL.
However, it's theoretically possible to get there after an EIO.
Cc: armbru@redhat.com
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Fixes: f7544edcd3 ("qemu-config: add error propagation to qemu_config_parse", 2021-03-06)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ensure that the callback to qemu_config_foreach is never called upon
an error, by moving the invocation before the "out" label.
Cc: armbru@redhat.com
Fixes: 3770141139 ("qemu-config: parse configuration files to a QDict", 2021-06-04)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches
- Make blockdev-reopen stable
- Remove deprecated qemu-img backing file without format
- rbd: Convert to coroutines and add write zeroes support
- rbd: Updated MAINTAINERS
- export/fuse: Allow other users access to the export
- vhost-user: Fix backends without multiqueue support
- Fix drive-backup transaction endless drained section
# gpg: Signature made Fri 09 Jul 2021 13:49:22 BST
# gpg: using RSA key DC3DEB159A9AF95D3D7456FE7F09B272C88F2FD6
# gpg: issuer "kwolf@redhat.com"
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (28 commits)
block: Make blockdev-reopen stable API
iotests: Test reopening multiple devices at the same time
block: Support multiple reopening with x-blockdev-reopen
block: Acquire AioContexts during bdrv_reopen_multiple()
block: Add bdrv_reopen_queue_free()
qcow2: Fix dangling pointer after reopen for 'file'
qemu-img: Improve error for rebase without backing format
qemu-img: Require -F with -b backing image
qcow2: Prohibit backing file changes in 'qemu-img amend'
blockdev: fix drive-backup transaction endless drained section
vhost-user: Fix backends without multiqueue support
MAINTAINERS: add block/rbd.c reviewer
block/rbd: fix type of task->complete
iotests/fuse-allow-other: Test allow-other
iotests/308: Test +w on read-only FUSE exports
export/fuse: Let permissions be adjustable
export/fuse: Give SET_ATTR_SIZE its own branch
export/fuse: Add allow-other option
export/fuse: Pass default_permissions for mount
util/uri: do not check argument of uri_free()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We know that in the body of this if statement i is less than len, so
we really should be copying len - i bytes not i - len bytes.
Fix this typo.
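For illustration, the shape of such a fill loop might look like the
sketch below. toy_fill() and toy_next() are hypothetical and use a
simple LCG rather than any real random source; the point is only the
length of the tail copy.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy generator (an LCG); stands in for a real random source. */
static uint64_t toy_next(uint64_t *state)
{
    *state = *state * 6364136223846793005ULL + 1442695040888963407ULL;
    return *state;
}

static void toy_fill(void *buf, size_t len, uint64_t seed)
{
    unsigned char *p = buf;
    uint64_t state = seed;
    uint64_t r;
    size_t i;

    /* Copy whole 8-byte words while they fit. */
    for (i = 0; i + sizeof(r) <= len; i += sizeof(r)) {
        r = toy_next(&state);
        memcpy(p + i, &r, sizeof(r));
    }
    if (i < len) {
        r = toy_next(&state);
        /* len - i bytes remain; i - len would wrap around as size_t. */
        memcpy(p + i, &r, len - i);
    }
}
```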
Fixes: 8d8404f156 ("util: Add qemu_guest_getrandom and associated routines")
Signed-off-by: Mark Nelson <mdnelson8@gmail.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210709120600.11080-1-mdnelson8@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
With target-specific modules we can have multiple modules implementing
the same object. Therefore we have to check the target arch on lookup
to find the correct module.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-20-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add module_allow_arch() to set the target architecture.
If a module is limited to some arch, verify that the arches
match and ignore the module if they do not.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-19-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One for module load and one for qom type lookup.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-18-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use module database to figure out which module adds a given QemuOpts group.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-17-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use module database to figure out which module implements a given QOM type.
Drop hard-coded object list.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-16-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use module database for module dependencies.
Drop hard-coded dependency list.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-15-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add script to generate C source with a small
database containing the module meta-data.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-4-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While most libraries do not need a CONFIG_* symbol because the
"when:" clauses are enough, some do. Add them back or stop
using them if possible.
In the case of libpmem, the statement to add the CONFIG_* symbol
was still in configure, but could not be triggered because it
checked for "no" instead of "disabled" (and it would be wrong anyway
since the test for the library has not been done yet).
Reported-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Fixes: 587d59d6cc ("configure, meson: convert virgl detection to meson", 2021-07-06)
Fixes: 83ef16821a ("configure, meson: convert libdaxctl detection to meson", 2021-07-06)
Fixes: e36e8c70f6 ("configure, meson: convert libpmem detection to meson", 2021-07-06)
Fixes: 53c22b68e3 ("configure, meson: convert liburing detection to meson", 2021-07-06)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
uri_free() checks if its argument is NULL in uri_clean() and g_free().
There is no need to check the argument before the call.
Signed-off-by: Heinrich Schuchardt <xypron.glpk@gmx.de>
Message-Id: <20210629063602.4239-1-xypron.glpk@gmx.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is not safe to pretend that emulated NVDIMM supports
persistence while the backend actually failed to enable it
and used a non-persistent mapping as a fallback.
Instead of falling back, QEMU should be more strict and
error out with a clear message that it's not supported.
So if the user asks for persistence (pmem=on), they should
store the backing file on NVDIMM.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210111203332.740815-1-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
-M was the sole user of qemu_opts_set and qemu_opts_set_defaults, so
remove them and the arguments that they used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow parsing multiple keyval sequences into the same dictionary.
This will be used to simplify the parsing of the -M command line
option, which is currently a .merge_lists = true QemuOpts group.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch introduces a function that merges two keyval-produced
(or keyval-like) QDicts. It can be used to emulate the behavior of
.merge_lists = true QemuOpts groups, merging -readconfig sections and
command-line options in a single QDict, and also to implement -set.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
BHs must be deleted before the AioContext is finalized. If not, it's a
bug and probably indicates that some part of the program still expects
the BH to run in the future. That can lead to memory leaks, inconsistent
state, or just hangs.
Unfortunately the assert(flags & BH_DELETED) call in aio_ctx_finalize()
is difficult to debug because the assertion failure contains no
information about the BH!
Use the QEMUBH name field added in the previous patch to show a useful
error when a leaked BH is detected.
Suggested-by: Eric Ernst <eric.g.ernst@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210414200247.917496-3-stefanha@redhat.com>
It can be difficult to debug issues with BHs in production environments.
Although BHs can usually be identified by looking up their ->cb()
function pointer, this requires debug information for the program. It is
also not possible to print human-readable diagnostics about BHs because
they have no identifier.
This patch adds a name to each BH. The name is not unique per instance
but differentiates between cb() functions, which is usually enough. It's
done by changing aio_bh_new() and friends to macros that stringify cb.
The next patch will use the name field when reporting leaked BHs.
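The stringification trick can be sketched as follows. ToyBH,
toy_bh_new_full() and toy_bh_new() are hypothetical names for this
illustration and do not mirror the real signatures.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef void ToyBHFunc(void *opaque);

typedef struct ToyBH {
    ToyBHFunc *cb;
    void *opaque;
    const char *name; /* points at a string literal, never freed */
} ToyBH;

static ToyBH *toy_bh_new_full(ToyBHFunc *cb, void *opaque, const char *name)
{
    ToyBH *bh = malloc(sizeof(*bh));

    assert(bh);
    bh->cb = cb;
    bh->opaque = opaque;
    bh->name = name;
    return bh;
}

/* The macro turns the callback expression into its own name. */
#define toy_bh_new(cb, opaque) toy_bh_new_full((cb), (opaque), #cb)

/* Example callback used to show what name gets recorded. */
static void toy_flush_cb(void *opaque)
{
    (void)opaque;
}
```

A BH created with toy_bh_new(toy_flush_cb, NULL) carries the name
"toy_flush_cb", which needs no debug information to print.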
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210414200247.917496-2-stefanha@redhat.com>
co-shared-resource is currently not thread-safe, as also reported
in co-shared-resource.h. Add a QemuMutex because co_try_get_from_shres
can also be invoked from non-coroutine context.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20210614081130.22134-6-eesposit@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The function is called with alignment == 0 which caused an assertion failure.
Use the code from oslib-posix.c to fix that regression.
Fixes: ed6f53f9ca
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210611105846.347954-1-sw@weilnetz.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Add a function that transforms a named fd inside a SocketAddress
structure into its numeric representation. This way it can then be used
in a context where the current monitor is not available.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20210610100802.5888-6-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: comment tweak]
Signed-off-by: Eric Blake <eblake@redhat.com>
If we want to wake up a coroutine from a worker thread, aio_co_wake()
currently does not work. In that scenario, aio_co_wake() calls
aio_co_enter(), but there is no current AioContext and therefore
qemu_get_current_aio_context() returns the main thread. aio_co_wake()
then attempts to call aio_context_acquire() instead of going through
aio_co_schedule().
The default case of qemu_get_current_aio_context() was added to cover
synchronous I/O started from the vCPU thread, but the main and vCPU
threads are quite different. The main thread is an I/O thread itself,
only running a more complicated event loop; the vCPU thread instead
is essentially a worker thread that occasionally calls
qemu_mutex_lock_iothread(). It is only in those critical sections
that it acts as if it were the home thread of the main AioContext.
Therefore, this patch detaches qemu_get_current_aio_context() from
iothreads, which is a useless complication. The AioContext pointer
is stored directly in the thread-local variable, including for the
main loop. Worker threads (including vCPU threads) optionally behave
as temporary home threads if they have taken the big QEMU lock,
but if that is not the case they will always schedule coroutines
on remote threads via aio_co_schedule().
With this change, the stub qemu_mutex_iothread_locked() must be changed
from true to false. The previous value of true was needed because the
main thread did not have an AioContext in the thread-local variable,
but now it does have one.
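The lookup rule described above can be modelled in a few lines. This is
only a sketch under assumptions: Ctx, my_ctx, big_lock_held, current_ctx()
and wake_action() are hypothetical stand-ins for AioContext, the
thread-local variable, qemu_mutex_iothread_locked(),
qemu_get_current_aio_context() and the decision taken on wake-up; the
real code does more work on each path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for AioContext; only its identity matters here. */
typedef struct Ctx { const char *name; } Ctx;

static Ctx main_ctx = { "main" };

/* Set by every thread that runs an event loop, including the main loop. */
static _Thread_local Ctx *my_ctx;
/* Models whether the calling thread currently holds the big lock. */
static _Thread_local bool big_lock_held;

/* Returns the context the calling thread is the home thread of, or NULL. */
static Ctx *current_ctx(void)
{
    if (my_ctx) {
        return my_ctx;
    }
    /* Worker/vCPU thread: temporary home thread only under the big lock. */
    return big_lock_held ? &main_ctx : NULL;
}

typedef enum { WAKE_ENTER_DIRECTLY, WAKE_SCHEDULE } WakeAction;

/* Decide how to wake a coroutine whose home context is 'home'. */
static WakeAction wake_action(Ctx *home)
{
    assert(home != NULL);
    return current_ctx() == home ? WAKE_ENTER_DIRECTLY : WAKE_SCHEDULE;
}
```

In this model, a worker thread without the big lock always gets
WAKE_SCHEDULE, which corresponds to going through aio_co_schedule().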
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210609122234.544153-1-pbonzini@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[eblake: tweak commit message per Vladimir's review]
Signed-off-by: Eric Blake <eblake@redhat.com>
We will shortly convert lockable.h to _Generic, and we cannot
have two compatible types in the same expansion. Wrap QemuMutex
in a struct, and unwrap in qemu-thread-posix.c.
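A minimal illustration of why the wrapping helps, assuming a C11
compiler. DemoMutex, DemoRecMutex and DEMO_LOCK_KIND are hypothetical
names, not the ones used in lockable.h. If DemoRecMutex were a plain
typedef of DemoMutex, the two associations below would name compatible
types and the compiler would reject the _Generic expression.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical plain mutex type. */
typedef struct { pthread_mutex_t lock; } DemoMutex;

/*
 * Wrapped in its own struct so that it is a distinct type. With
 * "typedef DemoMutex DemoRecMutex;" both associations below would be
 * compatible types, which _Generic does not allow.
 */
typedef struct { DemoMutex m; } DemoRecMutex;

/* Selects a value at compile time based on the pointer type of x. */
#define DEMO_LOCK_KIND(x) _Generic((x), \
    DemoMutex *: 1,                     \
    DemoRecMutex *: 2)

/* The controlling expression is not evaluated, only its type is used. */
static_assert(DEMO_LOCK_KIND((DemoMutex *)0) == 1, "plain mutex");
static_assert(DEMO_LOCK_KIND((DemoRecMutex *)0) == 2, "recursive mutex");
```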
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-6-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Create macros for file+line expansion in qemu_rec_mutex_unlock
like we have for qemu_mutex_unlock.
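The general pattern looks roughly like this. It is a sketch with
hypothetical names (DemoRecLock, demo_rec_unlock_impl, demo_rec_unlock);
the real macros and their arguments may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock that remembers where it was last unlocked. */
typedef struct {
    int depth;
    const char *last_file;
    int last_line;
} DemoRecLock;

/* Implementation takes the call site explicitly. */
static void demo_rec_unlock_impl(DemoRecLock *l, const char *file, int line)
{
    assert(l->depth > 0);
    l->depth--;
    l->last_file = file;
    l->last_line = line;
}

/* The macro captures the caller's file and line at the point of expansion. */
#define demo_rec_unlock(l) demo_rec_unlock_impl((l), __FILE__, __LINE__)
```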
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614233143.1221879-5-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the declarations from thread-win32.h into thread.h
and remove the macro redirection from thread-posix.h.
This will be required by following cleanups.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210614233143.1221879-4-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's support RAM_NORESERVE via MAP_NORESERVE on Linux. The flag has no
effect on most shared mappings - except for hugetlbfs and anonymous memory.
Linux man page:
"MAP_NORESERVE: Do not reserve swap space for this mapping. When swap
space is reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get SIGSEGV
upon a write if no physical memory is available. See also the discussion
of the file /proc/sys/vm/overcommit_memory in proc(5). In kernels before
2.6, this flag had effect only for private writable mappings."
Note that the "guarantee" part is wrong with memory overcommit in Linux.
Also, in Linux hugetlbfs is treated differently - we configure reservation
of huge pages from the pool, not reservation of swap space (huge pages
cannot be swapped).
The rough behavior is [1]:
a) !Hugetlbfs:
1) Without MAP_NORESERVE *or* with memory overcommit under Linux
disabled ("/proc/sys/vm/overcommit_memory == 2"), the following
accounting/reservation happens:
For a file backed map
SHARED or READ-only - 0 cost (the file is the map not swap)
PRIVATE WRITABLE - size of mapping per instance
For an anonymous or /dev/zero map
SHARED - size of mapping
PRIVATE READ-only - 0 cost (but of little use)
PRIVATE WRITABLE - size of mapping per instance
2) With MAP_NORESERVE, no accounting/reservation happens.
b) Hugetlbfs:
1) Without MAP_NORESERVE, huge pages are reserved.
2) With MAP_NORESERVE, no huge pages are reserved.
Note: With "/proc/sys/vm/overcommit_memory == 0", we were already able
to configure it for !hugetlbfs globally; this toggle now allows
configuring it more fine-grained, not for the whole system.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
[1] https://www.kernel.org/doc/Documentation/vm/overcommit-accounting
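For illustration, a private anonymous mapping without reservation can be
requested as sketched below. demo_map_noreserve() is a hypothetical
helper, not part of this patch, and silently dropping the flag when
MAP_NORESERVE is undefined is an assumption of the sketch (the patch
itself bails out when the flag is unsupported).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map 'size' bytes of private anonymous memory and
 * ask the kernel not to account/reserve swap space for it.
 * Returns NULL on failure.
 */
static void *demo_map_noreserve(size_t size)
{
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;
    void *p;

    assert(size > 0);
#ifdef MAP_NORESERVE
    flags |= MAP_NORESERVE;
#endif
    p = mmap(NULL, size, PROT_READ | PROT_WRITE, flags, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Pages are still only populated on first access; the flag changes the
accounting, not the population behavior.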
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-10-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's introduce RAM_NORESERVE, allowing mmap'ing with MAP_NORESERVE. The
new flag has the following semantics:
"
RAM is mmap-ed with MAP_NORESERVE. When set, reserving swap space (or huge
pages if applicable) is skipped: will bail out if not supported. When not
set, the OS will do the reservation, if supported for the memory type.
"
Allow passing it into:
- memory_region_init_ram_nomigrate()
- memory_region_init_resizeable_ram()
- memory_region_init_ram_from_file()
... and teach qemu_ram_mmap() and qemu_anon_ram_alloc() about the flag.
Bail out if the flag is not supported, which is the case right now for
both POSIX and win32. We will add Linux support next and allow specifying
RAM_NORESERVE via memory backends.
The target use case is virtio-mem, which dynamically exposes memory
inside a large, sparse memory area to the VM.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-9-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's pass flags instead of bools to prepare for passing other flags and
update the documentation of qemu_ram_mmap(). Introduce new QEMU_MAP_
flags that abstract the mmap() PROT_ and MAP_ flag handling and simplify
it.
We expose only flags that are currently supported by qemu_ram_mmap().
Maybe we'll see a qemu_mmap() in the future as well that can implement
these flags.
Note: We don't use MAP_ flags as some flags (e.g., MAP_SYNC) are only
defined for some systems and we want to always be able to identify
these flags reliably inside qemu_ram_mmap() -- for example, to properly
warn when some future flags are not available or effective on a system.
Also, this way we can simplify PROT_ handling as well.
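A sketch of the idea of abstract mapping flags. The DEMO_MAP_* names and
both helpers are hypothetical and do not reproduce the actual QEMU_MAP_
definitions. The point is that a flag is always defined on every system,
so the translation step can detect and report a flag it cannot honour.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical abstract flags, defined on every system. */
#define DEMO_MAP_READONLY  (1u << 0)
#define DEMO_MAP_SHARED    (1u << 1)
#define DEMO_MAP_NORESERVE (1u << 2)
#define DEMO_MAP_ALL \
    (DEMO_MAP_READONLY | DEMO_MAP_SHARED | DEMO_MAP_NORESERVE)

/* Derive the PROT_ bits from the abstract flags. */
static int demo_map_prot(uint32_t f)
{
    return PROT_READ | ((f & DEMO_MAP_READONLY) ? 0 : PROT_WRITE);
}

/*
 * Derive the MAP_ bits. Returns 0 and stores the result in *out, or -1
 * if a requested flag is not available on this system.
 */
static int demo_map_flags(uint32_t f, int *out)
{
    int flags = (f & DEMO_MAP_SHARED) ? MAP_SHARED : MAP_PRIVATE;

    assert(!(f & ~DEMO_MAP_ALL));
    if (f & DEMO_MAP_NORESERVE) {
#ifdef MAP_NORESERVE
        flags |= MAP_NORESERVE;
#else
        return -1; /* bail out instead of silently ignoring the flag */
#endif
    }
    *out = flags;
    return 0;
}
```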
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-8-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We want to activate memory within a reserved memory region, to make it
accessible. Let's factor that out.
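The reserve-then-activate pattern can be sketched as follows, assuming
anonymous memory only. demo_reserve() and demo_activate() are
hypothetical names; the factored-out functions in this series also deal
with file-backed memory and additional flags.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve address space only: PROT_NONE, so nothing is accessible yet. */
static void *demo_reserve(size_t size)
{
    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/*
 * Activate part of a reserved region by mapping accessible memory over
 * it. MAP_FIXED is safe here only because 'base' lies inside a region
 * that we reserved ourselves.
 */
static void *demo_activate(void *base, size_t size)
{
    void *p;

    assert(base != NULL);
    p = mmap(base, size, PROT_READ | PROT_WRITE,
             MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

The part of the reservation that is not activated stays inaccessible and
can act as padding or as a guard area.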
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We want to reserve a memory region without actually populating memory.
Let's factor that out.
Reviewed-by: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
Acked-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-3-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's factor out calculating the size of the guard page and rename the
variable to make it clearer that this pagesize only applies to the
guard page.
Reviewed-by: Peter Xu <peterx@redhat.com>
Acked-by: Murilo Opsfelder Araujo <muriloo@linux.ibm.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com> for memory backend and machine core
Cc: Igor Kotrasinski <i.kotrasinsk@partner.samsung.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210510114328.21835-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using qemu_opts_absorb_qdict, and then checking for any leftover options,
is redundant because there is already a function that does the same,
qemu_opts_from_qdict. qemu_opts_from_qdict consumes the whole dictionary
and therefore can just return an error message if an option fails to validate.
This also fixes a bug, because the "id" entry was retrieved in
qemu_config_do_parse and then left there by qemu_opts_absorb_qdict.
As a result, it was reported as an unrecognized option.
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Fixes: 3770141139 ("qemu-config: parse configuration files to a QDict")
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For --enable-tcg-interpreter on Windows, we will need this.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Multipath TCP allows combining multiple interfaces/routes into a single
socket, with very little work for the user/admin.
It's enabled by 'mptcp' on most socket addresses:
./qemu-system-x86_64 -nographic -incoming tcp:0:4444,mptcp
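At the socket level, MPTCP is requested through the protocol argument of
socket(2). The helper below is a hypothetical sketch, not the code from
this patch; in particular, falling back to plain TCP when the kernel
refuses MPTCP is an assumption of the sketch. 262 is the Linux value of
IPPROTO_MPTCP, used when the libc headers do not define it.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262 /* Linux value, for older libc headers */
#endif

/*
 * Hypothetical helper: open an IPv4 stream socket, preferring MPTCP if
 * requested. Falls back to plain TCP if MPTCP is unavailable.
 * Returns the fd, or -1 on failure. The caller must close() the fd.
 */
static int demo_stream_socket(int want_mptcp)
{
    int fd = -1;

    if (want_mptcp) {
        fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);
    }
    if (fd < 0) {
        fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    }
    return fd;
}
```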
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210421112834.107651-6-dgilbert@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Change the parser to put the values into a QDict and pass them
to a callback. qemu_config_parse's QemuOpts creation is
itself turned into a callback function.
This is useful for -readconfig to support keyval-based options;
getting a QDict from the parser removes a roundtrip from
QDict to QemuOpts and then back to QDict.
Unfortunately there is a disadvantage in that semantic errors will
point to the last line of the group, because the entries of the QDict
do not have a location attached.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210524105752.3318299-2-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
OpenBSD prior to 6.3 required a workaround to utilize fcntl(F_SETFL) on memory
devices.
Since only modern versions of OpenBSD are officially supported and buildable
on, and those do not have this issue, I am garbage collecting this workaround.
Signed-off-by: Brad Smith <brad@comstyle.com>
Message-Id: <YGYECGXQhdamEJgC@humpty.home.comstyle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The glib version was not previously constrained by RHEL-7 since it
rebases fairly often. Instead SLES 12 and Ubuntu 16.04 were the
constraints in 00f2cfbbec. Both of
these are old enough that they are outside our platform support
matrix now.
Per repology, current shipping versions are:
RHEL-8: 2.56.4
Debian Buster: 2.58.3
openSUSE Leap 15.2: 2.62.6
Ubuntu LTS 18.04: 2.56.4
Ubuntu LTS 20.04: 2.64.6
FreeBSD: 2.66.7
Fedora 33: 2.66.8
Fedora 34: 2.68.1
OpenBSD: 2.68.1
macOS HomeBrew: 2.68.1
Thus Ubuntu LTS 18.04 / RHEL-8 are the constraint for GLib version
at 2.56
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210514120415.1368922-11-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Willian Rampazzo <willianr@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.
We still have several references to the old file, so let's fix them
with the following command:
sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Right now the SPICE module is special cased to be loaded when processing
of the -spice command line option. However, the spice option group
can also be brought in via -readconfig, in which case the module is
not loaded.
Add a generic hook to load modules that provide a QemuOpts group,
and use it for the "spice" and "iscsi" groups.
Fixes: #194
Fixes: https://bugs.launchpad.net/qemu/+bug/1910696
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow using QemuCoSleep to sleep forever until woken by qemu_co_sleep_wake.
This makes the logic of qemu_co_sleep_ns_wakeable easy to understand.
In the future we will introduce an API that can work even if the
sleep and wake happen from different threads. For now, initializing
w->to_wake after timer_mod is fine because the timer can only fire in
the same AioContext.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-7-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Right now, users of qemu_co_sleep_ns_wakeable are simply passing
a pointer to QemuCoSleepState by reference to the function. But
QemuCoSleepState really is just a Coroutine*; making the
content of the struct public is just as efficient and lets us
skip the user_state_pointer indirection.
Since the usage is changed, take the occasion to rename the
struct to QemuCoSleep.
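A simplified model of the resulting shape, with hypothetical names
(DemoTask stands in for Coroutine, DemoSleep for QemuCoSleep). It only
shows that a public single-pointer struct is enough to make waking
idempotent; timers and the actual coroutine switch are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a coroutine; counts how often it was woken. */
typedef struct DemoTask { int wakeups; } DemoTask;

/* The whole sleep state is just a pointer to the sleeping task. */
typedef struct DemoSleep { DemoTask *to_wake; } DemoSleep;

/* Called by the task that goes to sleep. */
static void demo_sleep_begin(DemoSleep *w, DemoTask *self)
{
    assert(w->to_wake == NULL);
    w->to_wake = self;
}

/* Wakes the sleeper at most once; later calls are no-ops. */
static void demo_sleep_wake(DemoSleep *w)
{
    DemoTask *t = w->to_wake;

    w->to_wake = NULL;
    if (t) {
        t->wakeups++;
    }
}
```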
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-6-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This simplification is enabled by the previous patch. Now aio_co_wake
will only be called once, therefore we do not care about a spurious
firing of the timer after a qemu_co_sleep_wake.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
All callers of qemu_co_sleep_wake are checking whether they are passing
a NULL argument inside the pointer-to-pointer: do the check in
qemu_co_sleep_wake itself.
As a side effect, qemu_co_sleep_wake can be called more than once and
it will only wake the coroutine once; after the first time, the argument
will be set to NULL via *sleep_state->user_state_pointer. However, this
would not be safe unless co_sleep_cb keeps using the QemuCoSleepState*
directly, so make it go through the pointer-to-pointer instead.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-4-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Simplify the code by removing conditionals. qemu_co_sleep_ns
can simply point the argument to an on-stack temporary.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-3-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The lifetime of the timer is well-known (it cannot outlive
qemu_co_sleep_ns_wakeable, because it's deleted by the time the
coroutine resumes), so it is not necessary to place it on the heap.
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210517100548.28806-2-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Replaced a call to malloc() and its respective call to free()
with g_malloc() and g_free().
g_malloc() is preferred over the g_try_* functions, which
return NULL on error, when the size of the requested
allocation is small. This is because allocating a few
bytes should not be a problem in a healthy system; if it
fails, the system is already in a critical state.
Subsequently, removed the NULL check after g_malloc().
Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
Message-Id: <20210315105814.5188-3-ma.mandourr@gmail.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Ram block notifiers are currently not aware of resizes. To properly
handle resizes during migration, we want to teach ram block notifiers about
resizeable ram.
Introduce the basic infrastructure but keep using max_size in the
existing notifiers. Supply the max_size when adding and removing ram
blocks. Also, notify on resizes.
Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: xen-devel@lists.xenproject.org
Cc: haxm-team@intel.com
Cc: Paul Durrant <paul@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Wenchao Wang <wenchao.wang@intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-3-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Factor it out into common code when a new notifier is registered, just
as done with the memory region notifier. This keeps logic about how to
process existing ram blocks at a central place.
Just like when adding a new ram block, we have to register the max_length.
Ram blocks are only "fake resized". All memory (max_length) is mapped.
Print the warning from inside qemu_vfio_ram_block_added().
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-2-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
get_relocated_path() allocates a GString object and returns the
character data (C string) to the caller without freeing the memory
allocated for that object as reported by valgrind:
24 bytes in 1 blocks are definitely lost in loss record 2,805 of 6,532
at 0x4839809: malloc (vg_replace_malloc.c:307)
by 0x55AABB8: g_malloc (in /usr/lib64/libglib-2.0.so.0.6600.8)
by 0x55C2481: g_slice_alloc (in /usr/lib64/libglib-2.0.so.0.6600.8)
by 0x55C4827: g_string_sized_new (in /usr/lib64/libglib-2.0.so.0.6600.8)
by 0x55C4CEA: g_string_new (in /usr/lib64/libglib-2.0.so.0.6600.8)
by 0x906314: get_relocated_path (cutils.c:1036)
by 0x6E1F77: qemu_read_default_config_file (vl.c:2122)
by 0x6E1F77: qemu_init (vl.c:2687)
by 0x3E3AF8: main (main.c:49)
Let's use g_string_free(gstring, false) to free only the GString object
and transfer the ownership of the character data to the caller.
Fixes: f4f5ed2cbd ("cutils: introduce get_relocated_path")
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210412170255.231406-1-sgarzare@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging
* AccelCPUClass and sysemu/user split for i386 (Claudio)
* i386 page walk unification
* Fix detection of gdbus-codegen
* Misc refactoring
# gpg: Signature made Wed 12 May 2021 09:39:29 BST
# gpg: using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg: issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini-gitlab/tags/for-upstream: (32 commits)
coverity-scan: list components, move model to scripts/coverity-scan
configure: fix detection of gdbus-codegen
qemu-option: support accept-any QemuOptsList in qemu_opts_absorb_qdict
main-loop: remove dead code
target/i386: use mmu_translate for NPT walk
target/i386: allow customizing the next phase of the translation
target/i386: extend pg_mode to more CR0 and CR4 bits
target/i386: pass cr3 to mmu_translate
target/i386: extract mmu_translate
target/i386: move paging mode constants from SVM to cpu.h
target/i386: merge SVM_NPTEXIT_* with PF_ERROR_* constants
accel: add init_accel_cpu for adapting accel behavior to CPU type
accel: move call to accel_init_interfaces
i386: make cpu_load_efer sysemu-only
target/i386: gdbstub: only write CR0/CR2/CR3/EFER for sysemu
target/i386: gdbstub: introduce aux functions to read/write CS64 regs
i386: split off sysemu part of cpu.c
i386: split seg_helper into user-only and sysemu parts
i386: split svm_helper into sysemu and stub-only user
i386: separate fpu_helper sysemu-only parts
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
socket_get_fd() fails with the error "socket_get_fd: too many
connections" if the given listen backlog value is not 1.
Not all callers set the backlog to 1. For example, commit
582d4210eb ("qemu-nbd: Use SOMAXCONN for
socket listen() backlog") uses SOMAXCONN. This will always fail in
socket_get_fd().
This patch calls listen(2) on the fd to update the backlog value. The
socket may already be in the listen state. I have tested that this works
on Linux 5.10 and macOS Catalina.
As a bonus this allows us to detect when the fd cannot listen. Now we'll
be able to catch unbound or connected fds in socket_listen().
Drop the num argument from socket_get_fd() since this function is also
called by socket_connect() where a listen backlog value does not make
sense.
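The core of the approach, calling listen(2) again on an fd that may
already be listening, can be sketched as below. Both helpers are
hypothetical; demo_make_listener() exists only so that the sketch has an
fd to work with.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Update the backlog of a passed-in fd. Returns 0 on success, or -1 with
 * errno set if the fd cannot listen (for example, not a socket).
 */
static int demo_set_backlog(int fd, int backlog)
{
    return listen(fd, backlog);
}

/* Demo only: create a loopback listener with a backlog of 1. */
static int demo_make_listener(void)
{
    struct sockaddr_in sa;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0) {
        return -1;
    }
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = 0; /* let the kernel pick a free port */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(fd, 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```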
Fixes: e5b6353cf2 ("socket: Add backlog parameter to socket_listen")
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Juan Quintela <quintela@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20210310173004.420190-1-stefanha@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
qemu_add_child_watch has not been called anywhere since commit 2bdb920ece
("slirp: simplify fork_exec()", 2019-01-14); remove it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just a skeleton for starters, following patches will add more code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20210430113547.1816178-1-kraxel@redhat.com
Message-Id: <20210430113547.1816178-3-kraxel@redhat.com>
On Windows with glib <2.50, g_poll is redefined to use the variant
defined in util/oslib-win32.c. Use the same name in the declaration
and definition for ease of grepping.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Stop including sysemu/sysemu.h in files that don't need it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Add a simple transaction API for use in upcoming updates of block graph
operations.
The intended usage is:
- "prepare" is the main function of the action. It should make the main
effect of the action visible to the following actions while keeping
the possibility of rollback, saving whatever is needed in the action
state, which is prepended to the action list (to do that, the prepare
function should call tran_add()). The driver struct therefore has no
"prepare" field, as prepare is supposed to be called directly.
- commit/rollback is supposed to be called for the list of action
states, to commit/rollback all the actions in reverse order.
- When possible, "commit" should have no effect visible to other
actions, which makes transparent logical interaction between
actions possible.
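The usage described in the list above can be sketched as follows. All
Demo* names are hypothetical and this is not the API added by the patch
(only tran_add() is named in the text). The sketch shows how prepending
each action state makes rollback run in reverse order of preparation.

```c
#include <assert.h>
#include <stdlib.h>

/* Driver: no "prepare" field, prepare is called directly by the user. */
typedef struct DemoDrv {
    void (*commit)(void *opaque);
    void (*abort)(void *opaque);
} DemoDrv;

typedef struct DemoAct {
    const DemoDrv *drv;
    void *opaque;
    struct DemoAct *next;
} DemoAct;

typedef struct DemoTran { DemoAct *head; } DemoTran;

/* Called from a prepare function: prepend the action state to the list. */
static void demo_tran_add(DemoTran *t, const DemoDrv *drv, void *opaque)
{
    DemoAct *a = malloc(sizeof(*a));

    assert(a != NULL);
    a->drv = drv;
    a->opaque = opaque;
    a->next = t->head;
    t->head = a;
}

/* Walk the list from the head, i.e. in reverse order of preparation. */
static void demo_tran_finish(DemoTran *t, int commit)
{
    while (t->head) {
        DemoAct *a = t->head;
        void (*fn)(void *) = commit ? a->drv->commit : a->drv->abort;

        t->head = a->next;
        if (fn) {
            fn(a->opaque);
        }
        free(a);
    }
}

/* Example action: set an int, remembering the old value for rollback. */
typedef struct DemoSetInt { int *target; int old; } DemoSetInt;

static void demo_set_int_abort(void *opaque)
{
    DemoSetInt *s = opaque;
    *s->target = s->old;
}

static const DemoDrv demo_set_int_drv = { NULL, demo_set_int_abort };

/* "prepare": make the effect visible right away and register the state. */
static void demo_set_int_prepare(DemoTran *t, DemoSetInt *s,
                                 int *target, int value)
{
    s->target = target;
    s->old = *target;
    *target = value;
    demo_tran_add(t, &demo_set_int_drv, s);
}
```

Rolling back in preparation order instead would restore an intermediate
value, which is why the list is walked from the most recent action.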
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20210428151804.439460-9-vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The pseries machines introduced the concept of 'unplug timeout' for CPU
hotunplugs. The idea was to circumvent a deficiency in the pSeries
specification (PAPR), that currently does not define a proper way for
the hotunplug to fail. If the guest refuses to release the CPU (see [1]
for an example) there is no way for QEMU to detect the failure.
Further discussions about how to send a QAPI event to inform about the
hotunplug timeout [2] exposed problems that weren't predicted back when
the idea was developed. Other QEMU machines don't have any type of
hotunplug timeout mechanism for any device, e.g. ACPI based machines
have a way to make hotunplug errors visible to the hypervisor. This
would make this timeout mechanism exclusive to pSeries, which is not
ideal.
The real problem is that a QAPI event that reports hotunplug timeouts
puts the management layer (namely Libvirt) in a weird spot. We're not
saying that the hotunplug failed, because we can't be 100% sure of
that, and yet we're resetting the unplug state, which prevents any
DEVICE_DEL event from being emitted if the guest later decides to release
the device. Libvirt would need to inspect the guest itself to see if the
device was released or not, otherwise the internal domain states will be
inconsistent. Moreover, Libvirt already has an 'unplug timeout'
concept, and a QEMU side timeout would need to be juggled together with
the existing Libvirt timeout.
All this considered, this solution ended up creating more trouble than
it solved. This patch reverts the 3 commits that introduced the timeout
mechanism for CPU hotunplugs in pSeries machines.
This reverts commit 4515a5f786
"qemu_timer.c: add timer_deadline_ms() helper"
This reverts commit d1c2e3ce3d
"spapr_drc.c: add hotunplug timeout for CPUs"
This reverts commit 51254ffb32
"spapr_drc.c: introduce unplug_timeout_timer"
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1911414
[2] https://lists.gnu.org/archive/html/qemu-devel/2021-03/msg04682.html
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210401000437.131140-2-danielhb413@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-04-01' into staging
* Updates for the MAINTAINERS file
* Some small documentation updates
* Some small misc fixes
# gpg: Signature made Thu 01 Apr 2021 13:30:39 BST
# gpg: using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg: issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* remotes/thuth-gitlab/tags/pull-request-2021-04-01:
device-crash-test: Ignore errors about a bus not being available
docs: Fix typo in the default name of the qemu-system-x86_64 binary
docs: Remove obsolete paragraph about config-target.mak
util/compatfd.c: Fixed style issues
qom: Fix default values in help
MAINTAINERS: Mark SH-4 hardware emulation orphan
MAINTAINERS: Mark RX hardware emulation orphan
MAINTAINERS: add virtio-fs mailing list
MAINTAINERS: Drop the line with Xiang Zheng
MAINTAINERS: replace Huawei's email to personal one
MAINTAINERS: Drop the lines with Sarah Harris
MAINTAINERS: add/replace backups for some s390 areas
MAINTAINERS: Fix tests/migration maintainers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge remote-tracking branch 'remotes/marcandre/tags/for-6.0-pull-request' into staging
For 6.0 misc patches under my radar.
V2:
- "tests: Add tests for yank with the chardev-change case" updated
- drop the readthedoc theme patch
# gpg: Signature made Thu 01 Apr 2021 12:54:52 BST
# gpg: using RSA key 87A9BD933F87C606D276F62DDAE8E10975969CE5
# gpg: issuer "marcandre.lureau@redhat.com"
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>" [full]
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>" [full]
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/marcandre/tags/for-6.0-pull-request:
tests: Add tests for yank with the chardev-change case
chardev: Fix yank with the chardev-change case
chardev/char.c: Always pass id to chardev_new
chardev/char.c: Move object_property_try_add_child out of chardev_new
yank: Always link full yank code
yank: Remove dependency on qiochannel
docs: simplify each section title
dbus-vmstate: Increase the size of input stream buffer used during load
util: fix use-after-free in module_load_one
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Yank now only depends on util and can be always linked in. Also remove
the stubs as they are not needed anymore.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <997aa12a28c555d8a3b7a363b3bda5c3cf1821ba.1616521341.git.lukasstraub2@web.de>
Remove dependency on qiochannel by removing yank_generic_iochannel and
letting migration and chardev use their own yank function for
iochannel.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20ff143fc2db23e27cd41d38043e481376c9cec1.1616521341.git.lukasstraub2@web.de>
g_hash_table_add always retains ownership of the pointer passed in as
the key. Its return status merely indicates whether the added entry was
new, or replaced an existing entry. Thus key must never be freed after
this method returns.
Spotted by ASAN:
==2407186==ERROR: AddressSanitizer: heap-use-after-free on address 0x6020003ac4f0 at pc 0x7ffff766659c bp 0x7fffffffd1d0 sp 0x7fffffffc980
READ of size 1 at 0x6020003ac4f0 thread T0
#0 0x7ffff766659b (/lib64/libasan.so.6+0x8a59b)
#1 0x7ffff6bfa843 in g_str_equal ../glib/ghash.c:2303
#2 0x7ffff6bf8167 in g_hash_table_lookup_node ../glib/ghash.c:493
#3 0x7ffff6bf9b78 in g_hash_table_insert_internal ../glib/ghash.c:1598
#4 0x7ffff6bf9c32 in g_hash_table_add ../glib/ghash.c:1689
#5 0x5555596caad4 in module_load_one ../util/module.c:233
#6 0x5555596ca949 in module_load_one ../util/module.c:225
#7 0x5555596ca949 in module_load_one ../util/module.c:225
#8 0x5555596cbdf4 in module_load_qom_all ../util/module.c:349
Typical C bug...
Fixes: 90629122d2 ("module: use g_hash_table_add()")
Cc: qemu-stable@nongnu.org
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20210316134456.3243102-1-marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
An invariant of the current rwlock is that if multiple coroutines hold a
reader lock, all must be runnable. The unlock implementation relies on
this, choosing to wake a single coroutine when the final read lock
holder exits the critical section, assuming that it will wake a
coroutine attempting to acquire a write lock.
The downgrade implementation violates this assumption by creating a
read lock owning coroutine that is exclusively runnable - any other
coroutines that are waiting to acquire a read lock are *not* made
runnable when the write lock holder converts its ownership to read
only.
More generally, the old implementation had many other fairness bugs.
The root cause of the bugs was that CoQueue would wake up readers even
if there were pending writers, and would wake up writers even if there
were readers. In that case, the coroutine would go back to sleep *at
the end* of the CoQueue, losing its place at the head of the line.
To fix this, keep the queue of waiters explicitly in the CoRwlock
instead of using CoQueue, and store for each waiter whether it is a
potential reader or a writer. This way, downgrade can look at the
first queued coroutine and wake it only if it is a reader, causing
all other readers in line to be released in turn.
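The wake-up rule described above can be modelled with a minimal sketch.
This is not the CoRwlock code: ToyWaiter and toy_wake_readers() are
hypothetical names, and the queue is a plain array rather than a list of
coroutines. It only shows the policy of waking readers from the head of
the queue and stopping at the first writer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical waiter record: one per queued coroutine, tagged by intent. */
typedef struct ToyWaiter {
    bool is_writer;
    bool woken;
} ToyWaiter;

/*
 * Starting at the head of the queue, wake consecutive readers and stop at
 * the first writer, which therefore keeps its place in line.
 * Returns the number of waiters that were woken.
 */
static size_t toy_wake_readers(ToyWaiter *queue, size_t len)
{
    size_t woken = 0;

    assert(queue != NULL || len == 0);
    for (size_t i = 0; i < len && !queue[i].is_writer; i++) {
        queue[i].woken = true;
        woken++;
    }
    return woken;
}
```

With a queue of reader, reader, writer, reader, the first two readers are
woken and the writer and the reader behind it stay queued.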
Reported-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: David Edmondson <david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210325112941.365238-5-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When taking the slow path for mutex acquisition, set the coroutine
value in the CoWaitRecord in push_waiter(), rather than both there and
in the caller.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20210325112941.365238-4-pbonzini@redhat.com
Message-Id: <20210309144015.557477-4-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since the virtio-gpu-ccw device depends on the hw-display-virtio-gpu
module, which provides the type virtio-gpu-device, packaging the
hw-display-virtio-gpu module as a separate package that may or may not
be installed along with the qemu package leads to problems. Namely if
the hw-display-virtio-gpu is absent, qemu continues to advertise
virtio-gpu-ccw, but it aborts not only when one attempts using
virtio-gpu-ccw, but also when libvirtd's capability probing tries
to instantiate the type to introspect it.
Let us thus introduce a module named hw-s390x-virtio-gpu-ccw that
is going to provide the virtio-gpu-ccw device. The hw-s390x prefix
was chosen because it is not a portable device.
With virtio-gpu-ccw built as a module, the correct way to package a
modularized qemu is to require that hw-display-virtio-gpu be
installed whenever the module hw-s390x-virtio-gpu-ccw is installed.
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20210317095622.2839895-4-kraxel@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Mingw recognizes that "0x" has value 0 without setting errno, but
fails to advance endptr to the trailing garbage 'x'. This in turn
showed up in our recent testsuite additions for qemu_strtosz (commit
1657ba44b4 utils: Enhance testsuite for do_strtosz()); adjust our
remaining tests to show that we now work around this Windows bug.
This patch intentionally fails check-syntax for use of strtol.
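One way to picture such a workaround is the sketch below. It is an
assumption about the general shape, not the code from this patch:
toy_strtoull() is a hypothetical wrapper that handles a bare "0x" by
hand, so the end pointer lands on the 'x' whatever the C library does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper (not the QEMU function): behaves like
 * strtoull(nptr, &end, 0), except that "0x" with no hex digit after it is
 * parsed by hand as the number 0 followed by trailing garbage "x...".
 */
static unsigned long long toy_strtoull(const char *nptr, const char **endptr)
{
    const char *p = nptr;
    char *end;
    unsigned long long val;

    assert(nptr != NULL && endptr != NULL);
    while (isspace((unsigned char)*p)) {
        p++;
    }
    if (*p == '+') {
        p++;
    }
    if (p[0] == '0' && (p[1] == 'x' || p[1] == 'X') &&
        !isxdigit((unsigned char)p[2])) {
        *endptr = p + 1;    /* point at the 'x', as the tests expect */
        return 0;
    }
    val = strtoull(nptr, &end, 0);
    *endptr = end;
    return val;
}
```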
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210317143325.2165821-3-eblake@redhat.com>
Message-Id: <20210323165308.15244-15-alex.bennee@linaro.org>
Our tests were not validating the return value in all cases, nor were
they guaranteeing our documented claim that 'res' is unchanged on error.
For that matter, it wasn't as thorough as the existing tests for
qemu_strtoi() and friends for proving that endptr and res are sanely
set. Enhancing the test found one case where we violated our
documentation: namely, when failing with EINVAL when endptr is NULL,
we shouldn't modify res.
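The "res is unchanged on error" contract is easiest to keep by parsing
into a local and storing to *res only on the success path. A minimal
sketch, assuming a hypothetical toy_parse_u64() rather than the real
qemu_strtosz() signature:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: returns 0 and stores the value in *res on success.
 * On any failure it returns a negative errno and leaves *res untouched.
 * A NULL endptr means the caller does not accept trailing characters.
 */
static int toy_parse_u64(const char *s, const char **endptr, uint64_t *res)
{
    char *end;
    unsigned long long val;

    assert(s != NULL && res != NULL);
    errno = 0;
    val = strtoull(s, &end, 10);
    if (end == s) {
        if (endptr) {
            *endptr = s;
        }
        return -EINVAL;             /* no digits at all */
    }
    if (errno == ERANGE) {
        return -ERANGE;
    }
    if (!endptr && *end != '\0') {
        return -EINVAL;             /* trailing garbage, nowhere to report it */
    }
    if (endptr) {
        *endptr = end;
    }
    *res = val;                     /* only reached on success */
    return 0;
}
```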
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210317143325.2165821-2-eblake@redhat.com>
Message-Id: <20210323165308.15244-14-alex.bennee@linaro.org>
Once we've parsed the fractional value, extract it into an integral
64-bit fraction. Perform the scaling with integer arithmetic, and
simplify the overflow detection.
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210315155835.1970210-2-richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Next batch of patches for the ppc target and machine types. Includes:
* Several cleanups for sm501 from Peter Maydell
* An update to the SLOF guest firmware
* Improved handling of hotplug failures in spapr, associated cleanups
to the hotplug handling code
* Several etsec fixes and cleanups from Bin Meng
* Assorted other fixes and cleanups
Merge remote-tracking branch 'remotes/dg-gitlab/tags/ppc-for-6.0-20210310' into staging
ppc patch queue for 2021-03-10
# gpg: Signature made Wed 10 Mar 2021 04:08:53 GMT
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dg-gitlab/tags/ppc-for-6.0-20210310:
spapr.c: send QAPI event when memory hotunplug fails
spapr.c: remove duplicated assert in spapr_memory_unplug_request()
target/ppc: fix icount support on Book-e vms accessing SPRs
qemu_timer.c: add timer_deadline_ms() helper
spapr_pci.c: add 'unplug already in progress' message for PCI unplug
spapr.c: add 'unplug already in progress' message for PHB unplug
hw/ppc: e500: Add missing <ranges> in the eTSEC node
hw/net: fsl_etsec: Fix build error when HEX_DUMP is on
spapr_drc.c: use DRC reconfiguration to cleanup DIMM unplug state
spapr_drc.c: add hotunplug timeout for CPUs
spapr_drc.c: introduce unplug_timeout_timer
target/ppc: Fix bcdsub. emulation when result overflows
docs/system: Extend PPC section
spapr: rename spapr_drc_detach() to spapr_drc_unplug_request()
spapr_drc.c: use spapr_drc_release() in isolate_physical/set_unusable
pseries: Update SLOF firmware image
spapr_drc.c: do not call spapr_drc_detach() in drc_isolate_logical()
hw/display/sm501: Inline template header into C file
hw/display/sm501: Expand out macros in template header
hw/display/sm501: Remove dead code for non-32-bit RGB surfaces
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
- Add Vladimir as NBD co-maintainer
- Fix reporting of holes in NBD_CMD_BLOCK_STATUS
- Improve command-line parsing accuracy of large numbers (anything going
through qemu_strtosz), including the deprecation of hex+suffix
- Improve some error reporting in the block layer
Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2021-03-09' into staging
nbd patches for 2021-03-09
# gpg: Signature made Tue 09 Mar 2021 15:38:10 GMT
# gpg: using RSA key 71C2CC22B1C4602927D2F3AAA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>" [full]
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>" [full]
# gpg: aka "[jpeg image of size 6874]" [full]
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2021-03-09:
block/qcow2: refactor qcow2_update_options_prepare error paths
block/qed: bdrv_qed_do_open: deal with errp
block/qcow2: simplify qcow2_co_invalidate_cache()
block/qcow2: read_cache_sizes: return status value
block/qcow2-bitmap: return status from qcow2_store_persistent_dirty_bitmaps
block/qcow2-bitmap: improve qcow2_load_dirty_bitmaps() interface
block/qcow2: qcow2_get_specific_info(): drop error propagation
blockjob: return status from block_job_set_speed()
block/mirror: drop extra error propagation in commit_active_start()
block: drop extra error propagation for bdrv_set_backing_hd
blockdev: fix drive_backup_prepare() missed error
block: check return value of bdrv_open_child and drop error propagation
utils: Deprecate hex-with-suffix sizes
utils: Improve qemu_strtosz() to have 64 bits of precision
utils: Enhance testsuite for do_strtosz()
nbd: server: Report holes for raw images
MAINTAINERS: add Vladimir as co-maintainer of NBD
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The pSeries machine is using QEMUTimer internals to return the timeout
in seconds for a timer object, in hw/ppc/spapr.c, function
spapr_drc_unplug_timeout_remaining_sec().
Create a helper in qemu-timer.c to retrieve the deadline for a QEMUTimer
object, in ms, to avoid exposing timer internals to the PPC code.
CC: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20210301124133.23800-2-danielhb413@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We already have a global function called id_generate() to create unique
IDs within QEMU. Let's use it in the network subsystem, too, instead of
inventing our own ID scheme here.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210215090225.1046239-1-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
There are 23 files that include the "sysemu/qtest.h",
but they do not use any qtest functions.
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226081414.205946-1-kuhn.chenqun@huawei.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Supporting '0x20M' looks odd, particularly since we have a 'B' suffix
that is ambiguous for bytes, as well as a less-frequently-used 'E'
suffix for extremely large exibytes. In practice, people using hex
inputs are specifying values in bytes (and would have written
0x2000000, or possibly relied on default_suffix in the case of
qemu_strtosz_MiB), and the use of scaling suffixes makes the most
sense for inputs in decimal (where the user would write 32M). But
rather than outright dropping support for hex-with-suffix, let's
follow our deprecation policy. Sadly, since qemu_strtosz() does not
have an Err** parameter, and plumbing that in would be a much larger
task, we instead go with just directly emitting the deprecation
warning to stderr.
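A sketch of how hex-with-suffix input could be detected follows. The
function name is hypothetical and this is not the check used in
qemu_strtosz(). It also shows the ambiguity mentioned above: 'B' and 'E'
are hex digits, so "0x20B" is simply the number 0x20B with no suffix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical check: true if "s" is a hex number ("0x...") directly
 * followed by a scaling suffix. 'B' and 'E' never match here because
 * strtoull() consumes them as hex digits.
 */
static bool toy_is_hex_with_suffix(const char *s)
{
    char *end;

    assert(s != NULL);
    if (!(s[0] == '0' && (s[1] == 'x' || s[1] == 'X'))) {
        return false;
    }
    (void)strtoull(s, &end, 16);    /* only the end position matters */
    return *end != '\0' && strchr("kKmMgGtTpP", *end) != NULL;
}
```

A caller would print the deprecation warning to stderr when this
returns true and then carry on parsing as before.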
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210211204438.1184395-4-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
We have multiple clients of qemu_strtosz (qemu-io, the opts visitor,
the keyval visitor), and it gets annoying that edge-case testing is
impacted by implicit rounding to 53 bits of precision due to parsing
with strtod(). As an example posted by Rich Jones:
$ nbdkit memory $(( 2**63 - 2**30 )) --run \
'build/qemu-io -f raw "$uri" -c "w -P 3 $(( 2**63 - 2**30 - 512 )) 512" '
write failed: Input/output error
because 9223372035781033472 got rounded to 0x7fffffffc0000000 which is
out of bounds.
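The rounding can be reproduced outside QEMU with a few lines of C. The
two helpers below are illustrative only and assume IEEE 754 doubles;
they parse the same string once through strtod() and once through
strtoull().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse through a double, which keeps only 53 bits of precision. */
static uint64_t toy_parse_via_double(const char *s)
{
    double d = strtod(s, NULL);

    assert(d >= 0 && d < 18446744073709551616.0);   /* fits in uint64_t */
    return (uint64_t)d;
}

/* Parse as an integer, which keeps all 64 bits. */
static uint64_t toy_parse_via_integer(const char *s)
{
    return strtoull(s, NULL, 10);
}
```

Feeding "9223372035781033472" to both shows the difference: the integer
parse returns the value unchanged, while the double parse lands on the
rounded value quoted above.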
It is also worth noting that our existing parser, by virtue of using
strtod(), accepts decimal AND hex numbers, even though test-cutils
previously lacked any coverage of the latter until the previous patch.
We do have existing clients that expect a hex parse to work (for
example, iotest 33 using qemu-io -c "write -P 0xa 0x200 0x400"), but
strtod() parses "08" as 8 rather than as an invalid octal number, so
we know there are no clients that depend on octal. Our use of
strtod() also means that "0x1.8k" would actually parse as 1536 (the
fraction is 8/16), rather than 1843 (if the fraction were 8/10); but
as this was not covered in the testsuite, I have no qualms forbidding
hex fractions as invalid, so this patch declares that the use of
fractions is only supported with decimal input, and enhances the
testsuite to document that.
Our previous use of strtod() meant that -1 parsed as a negative; now
that we parse with strtoull(), negative values can wrap around modulo
2^64, so we have to explicitly check whether the user passed in a '-';
and make it consistent to also reject '-0'. This has the minor effect
of treating negative values as EINVAL (with no change to endptr)
rather than ERANGE (with endptr advanced to what was parsed), visible
in the updated iotest output.
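The wrap-around and the explicit sign check can be sketched as follows.
toy_parse_nonnegative() is a hypothetical helper, not the patched
parser. It relies on the fact that strtoull() accepts a leading '-' and
negates the result in the unsigned type.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: reject any input whose first non-blank character
 * is '-', including "-0", before handing the string to strtoull().
 * Returns 0 on success, or a negative errno with *res left untouched.
 */
static int toy_parse_nonnegative(const char *s, uint64_t *res)
{
    const char *p = s;
    char *end;
    unsigned long long val;

    assert(s != NULL && res != NULL);
    while (isspace((unsigned char)*p)) {
        p++;
    }
    if (*p == '-') {
        return -EINVAL;
    }
    errno = 0;
    val = strtoull(p, &end, 10);
    if (end == p) {
        return -EINVAL;
    }
    if (errno == ERANGE) {
        return -ERANGE;
    }
    *res = val;
    return 0;
}
```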
We also had no testsuite coverage of "1.1e0k", which happened to parse
under strtod() but is unlikely to occur in practice; as long as we are
making things more robust, it is easy enough to reject the use of
exponents in a strtod parse.
The fix is done by breaking the parse into an integer prefix (no loss
in precision), rejecting negative values (since we can no longer rely
on strtod() to do that), determining if a decimal or hexadecimal parse
was intended (with the new restriction that a fractional hex parse is
not allowed), and where appropriate, using a floating point fractional
parse (where we also scan to reject use of exponents in the fraction).
The bulk of the patch is then updates to the testsuite to match our
new precision, as well as adding new cases we reject (whether they
were rejected or inadvertently accepted before).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20210211204438.1184395-3-eblake@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
The "delay" option was a hack that was introduced to allow writing "nodelay".
We are adding a "nodelay" option to be used as "nodelay=on", so recommend it
instead of "delay".
This is quite ugly, but a proper deprecation of "delay"
cannot be done if QEMU starts suggesting it. Since it's the
only case, I opted for this very much ad-hoc patch.
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This enables some simplification of vl.c via error_fatal, and improves
error messages. Before:
$ ./qemu-system-x86_64 -readconfig .
qemu-system-x86_64: error reading file
qemu-system-x86_64: -readconfig .: read config .: Invalid argument
$ /usr/libexec/qemu-kvm -readconfig foo
qemu-kvm: -readconfig foo: read config foo: No such file or directory
After:
$ ./qemu-system-x86_64 -readconfig .
qemu-system-x86_64: -readconfig .: Cannot read config file: Is a directory
$ ./qemu-system-x86_64 -readconfig foo
qemu-system-x86_64: -readconfig foo: Could not open 'foo': No such file or directory
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226170816.231173-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise the call to event_notifier_set() is a nop, which causes
the SLOF firmware on POWER to hang when booting from a virtio-scsi
device:
virtio_scsi_dataplane_start()
virtio_scsi_vring_init()
virtio_bus_set_host_notifier() <- assign == true
event_notifier_init() <- active == 1
event_notifier_set() <- fails right away if !e->initialized
Fixes: e34e47eb28 ("event_notifier: handle initialization failure better")
Cc: mlevitsk@redhat.com
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <20210216120247.1293569-1-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When looking for the next directory component, a "." component is now skipped.
This fixes the path(s) used for firmware lookup for the prefix == bindir case
which is standard for QEMU on Windows and where the internally
used bindir value ends with "/.".
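A minimal sketch of component iteration that skips "." is shown below.
The helper names are hypothetical and only '/' is treated as a
separator, which is a simplification compared with real Windows paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the start of the next directory component
 * in "dir", skipping separators and "." components, and store its length
 * in *len. At the end of the string, *len is 0.
 */
static const char *toy_next_component(const char *dir, size_t *len)
{
    assert(dir != NULL && len != NULL);
    for (;;) {
        size_t n;

        while (*dir == '/') {
            dir++;
        }
        n = strcspn(dir, "/");
        if (n == 1 && dir[0] == '.') {
            dir++;          /* skip the "." component and keep looking */
            continue;
        }
        *len = n;
        return dir;
    }
}

/* Count the components of "dir" that are not ".". */
static size_t toy_count_components(const char *dir)
{
    size_t len;
    size_t count = 0;

    for (dir = toy_next_component(dir, &len); len > 0;
         dir = toy_next_component(dir + len, &len)) {
        count++;
    }
    return count;
}
```

With this rule "/usr/bin/." and "/usr/bin" yield the same sequence of
components, so a prefix comparison between them no longer breaks on the
trailing "/.".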
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20210208205752.2488774-1-sw@weilnetz.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow RAM MemoryRegion to be created from an offset in a file, instead
of allocating at offset of 0 by default. This is needed to synchronize
RAM between QEMU & remote process.
Signed-off-by: Jagannathan Raman <jag.raman@oracle.com>
Signed-off-by: John G Johnson <john.g.johnson@oracle.com>
Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 609996697ad8617e3b01df38accc5c208c24d74e.1611938319.git.jag.raman@oracle.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add an 'initialized' field and use it to avoid touching event notifiers
that are either not initialized or whose initialization failed.
This is somewhat of a hack, but it seems the least intrusive way to make
virtio code deal with event notifiers that failed initialization.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20201217150040.906961-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Glue code to the userfaultfd kernel implementation.
Querying feature support, creating file descriptor, feature control,
memory region registration, IOCTLs on registered regions.
Signed-off-by: Andrey Gruzdev <andrey.gruzdev@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20210129101407.103458-3-andrey.gruzdev@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixed up range.start casting for 32bit
Developer errors are better represented with assert() rather than abort(). Also
improve the strictness of the checks by using range checks within the assert()
rather than converting the existing equality checks to inequality checks.
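The difference between an equality check and a range check can be seen
in a small sketch. The fixed-size FIFO below is hypothetical and stands
in for whatever structure the patch touches.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_FIFO_CAPACITY 4u

/* Hypothetical fixed-size byte FIFO. */
typedef struct ToyFifo {
    uint8_t data[TOY_FIFO_CAPACITY];
    uint32_t head;
    uint32_t num;
} ToyFifo;

static void toy_fifo_push(ToyFifo *f, uint8_t v)
{
    /*
     * Range check: trips for num == capacity and also for a corrupted
     * num > capacity. "if (num == capacity) abort();" would only catch
     * the first of these.
     */
    assert(f->num < TOY_FIFO_CAPACITY);
    f->data[(f->head + f->num) % TOY_FIFO_CAPACITY] = v;
    f->num++;
}

static uint8_t toy_fifo_pop(ToyFifo *f)
{
    uint8_t v;

    assert(f->num > 0 && f->num <= TOY_FIFO_CAPACITY);
    v = f->data[f->head];
    f->head = (f->head + 1) % TOY_FIFO_CAPACITY;
    f->num--;
    return v;
}
```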
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210121102518.20112-1-mark.cave-ayland@ilande.co.uk>
Actually, we can't extend the io vector in all cases. Handle possible
MAX_IOV and size_t overflows.
For now add assertion to callers (actually they rely on success anyway)
and fix them in the following patch.
Also add some more assertions to qemu_iovec_init_slice() while we are
here.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20201211183934.169161-3-vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
There is currently no way to open(O_RDONLY) and mmap(PROT_READ) when
creating a memory region from a file. This functionality is needed since
the underlying host file may not allow writing.
Add a bool readonly argument to memory_region_init_ram_from_file() and
the APIs it calls.
Extend memory_region_init_ram_from_file() rather than introducing a
memory_region_init_rom_from_file() API so that callers can easily make a
choice between read/write and read-only at runtime without calling
different APIs.
No new RAMBlock flag is introduced for read-only because it's unclear
whether RAMBlocks need to know that they are read-only. Pass a bool
readonly argument instead.
Both of these design decisions can be changed in the future. It just
seemed like the simplest approach to me.
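At the POSIX level, a single bool can select both the open() mode and
the mmap() protection, roughly as in the sketch below. toy_map_file() is
a hypothetical helper for illustration; it is not
memory_region_init_ram_from_file() and omits everything QEMU-specific.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map the first "len" bytes of "path", read-only or
 * read/write depending on "readonly". Returns NULL on failure.
 */
static void *toy_map_file(const char *path, size_t len, bool readonly)
{
    int fd;
    void *p;

    assert(path != NULL && len > 0);
    fd = open(path, readonly ? O_RDONLY : O_RDWR);
    if (fd < 0) {
        return NULL;
    }
    p = mmap(NULL, len, readonly ? PROT_READ : (PROT_READ | PROT_WRITE),
             MAP_SHARED, fd, 0);
    close(fd);      /* the mapping stays valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}
```

Because the choice is a runtime argument, a caller can decide between
read/write and read-only per file without switching to a different
function.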
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210104171320.575838-2-stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The -msg timestamp=on|off option controls whether a timestamp is printed
with error_report() messages. The "-msg" name suggests that this option
has a wider effect than just error_report(). The next patch extends it
to the 'log' trace backend, so rename the variable from
error_with_timestamp to message_with_timestamp.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210125113507.224287-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Disposition (action) for any given signal is global for the process.
When two threads run coroutine-sigaltstack's qemu_coroutine_new()
concurrently, they may interfere with each other: One of them may revert
the SIGUSR2 handler to SIG_DFL, between the other thread (a) setting up
coroutine_trampoline() as the handler and (b) raising SIGUSR2. That
SIGUSR2 will then terminate the QEMU process abnormally.
We have to ensure that only one thread at a time can modify the
process-global SIGUSR2 handler. To do so, wrap the whole section where
that is done in a mutex.
Alternatively, we could for example have the SIGUSR2 handler always be
coroutine_trampoline(), so there would be no need to invoke sigaction()
in qemu_coroutine_new(). Laszlo has posted a patch to do so here:
https://lists.nongnu.org/archive/html/qemu-devel/2021-01/msg05962.html
However, given that coroutine-sigaltstack is more of a fallback
implementation for platforms that do not support ucontext, that change
may be a bit too invasive to be comfortable with. The mutex proposed
here may negatively impact performance, but the change is much simpler.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20210125120305.19520-1-mreitz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
- Various improvements for SD cards in SPI mode (Bin Meng)
Merge remote-tracking branch 'remotes/philmd-gitlab/tags/sdmmc-20210124' into staging
SD/MMC patches
# gpg: Signature made Sun 24 Jan 2021 19:16:55 GMT
# gpg: using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD 6BB2 E3E3 2C2C DEAD C0DE
* remotes/philmd-gitlab/tags/sdmmc-20210124:
hw/sd: sd.h: Cosmetic change of using spaces
hw/sd: ssi-sd: Use macros for the dummy value and tokens in the transfer
hw/sd: ssi-sd: Fix the wrong command index for STOP_TRANSMISSION
hw/sd: ssi-sd: Add a state representing Nac
hw/sd: ssi-sd: Suffix a data block with CRC16
util: Add CRC16 (CCITT) calculation routines
hw/sd: sd: Drop sd_crc16()
hw/sd: sd: Support CMD59 for SPI mode
hw/sd: ssi-sd: Fix incorrect card response sequence
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Import CRC16 calculation routines from Linux kernel v5.10:
include/linux/crc-ccitt.h
lib/crc-ccitt.c
to QEMU:
include/qemu/crc-ccitt.h
util/crc-ccitt.c
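For readers unfamiliar with the algorithm, here is a bit-at-a-time
sketch of a CRC16 over the reflected CCITT polynomial (0x8408, the bit
reversal of 0x1021). The function name is made up and the code is
written from scratch for illustration. It is not the imported
implementation and makes no claim about which variant the SD code uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative bitwise CRC16 using the reflected CCITT polynomial.
 * "crc" is the starting value, which allows a checksum to be continued
 * across several buffers.
 */
static uint16_t toy_crc_ccitt(uint16_t crc, const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    while (len--) {
        crc ^= *buf++;
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1) {
                crc = (uint16_t)((crc >> 1) ^ 0x8408);
            } else {
                crc = (uint16_t)(crc >> 1);
            }
        }
    }
    return crc;
}
```

A table-driven version computes the same values one byte at a time by
precomputing the inner loop for all 256 byte values.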
Signed-off-by: Bin Meng <bin.meng@windriver.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20210123104016.17485-7-bmeng.cn@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMD: Restrict compilation to system emulation]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Options such as "server" or "nowait", that are commonly found in -chardev,
are sugar for "server=on" and "wait=off". This is quite surprising and
also does not have any notion of typing attached. It is even possible to
do "-device e1000,noid" and get a device with "id=off".
Deprecate it and print a warning when it is encountered. In general,
this short form for boolean options only seems to be in wide use for
-chardev and -spice.
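The sugar itself amounts to the expansion sketched below. ToyFlag and
toy_expand_short_bool() are hypothetical and not part of QemuOpts. The
sketch makes the untyped behaviour visible: any name at all is accepted,
so "noid" quietly becomes id=off.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical result of expanding a bare option that has no '='. */
typedef struct ToyFlag {
    const char *name;   /* points into the original option string */
    bool value;
} ToyFlag;

/* Expand "foo" to foo=on and "nofoo" to foo=off, with no type checking. */
static ToyFlag toy_expand_short_bool(const char *opt)
{
    ToyFlag f;

    assert(opt != NULL && strchr(opt, '=') == NULL);
    if (strncmp(opt, "no", 2) == 0) {
        f.name = opt + 2;
        f.value = false;
    } else {
        f.name = opt;
        f.value = true;
    }
    return f;
}
```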
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Right now, help options are parsed normally and then checked
specially in opt_validate, but only if coming from
qemu_opts_parse_noisily. has_help_option does the check on its own.
opt_validate() has two callers: qemu_opt_set(), which passes null and is
therefore unaffected, and opts_do_parse(), which is affected.
opts_do_parse() is called by qemu_opts_do_parse(), which passes null and
is therefore unaffected, and opts_parse().
opts_parse() is called by qemu_opts_parse() and qemu_opts_set_defaults(),
which pass null and are therefore unaffected, and
qemu_opts_parse_noisily().
Move the check from opt_validate to the parsing workhorse of QemuOpts,
get_opt_name_value. This will come in handy in the next patch, which
will raise a warning for "-object memory-backend-ram,share" ("flag" option
with no =on/=off part) but not for "-object memory-backend-ram,help".
As a result:
- opts_parse and opts_do_parse do not return an error anymore
when help is requested; qemu_opts_parse_noisily does not have
to work around that anymore.
- various crazy ways to request help are not recognized anymore:
- "help=..."
- "nohelp" (sugar for "help=off")
- "?=..."
- "no?" (sugar for "?=off")
- "help" would be recognized as help request even if there is a (foolishly
named) parameter "help". No such parameters exist, though.
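The resulting rule for a single comma-separated element can be written
as a tiny predicate. This is a sketch of the behaviour listed above, not
the code in get_opt_name_value, and the function name is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical predicate: only a bare "help" or "?" is a help request.
 * Forms with a value ("help=...", "?=...") or a "no" prefix are not.
 */
static bool toy_is_help_request(const char *opt)
{
    assert(opt != NULL);
    return strcmp(opt, "help") == 0 || strcmp(opt, "?") == 0;
}
```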
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Looking at all merge-lists QemuOptsList, here is how they access their
QemuOpts:
reopen_opts in qemu-io-cmds.c ("qemu-io reopen -o")
    qemu_opts_find(&reopen_opts, NULL)
empty_opts in qemu-io.c ("qemu-io open -o")
    qemu_opts_find(&empty_opts, NULL)
qemu_rtc_opts ("-rtc")
    qemu_find_opts_singleton("rtc")
qemu_machine_opts ("-M")
    qemu_find_opts_singleton("machine")
qemu_action_opts ("-action")
    qemu_opts_foreach->process_runstate_actions
qemu_boot_opts ("-boot")
    in hw/nvram/fw_cfg.c and hw/s390x/ipl.c:
        QTAILQ_FIRST(&qemu_find_opts("boot-opts")->head)
    in softmmu/vl.c:
        qemu_opts_find(qemu_find_opts("boot-opts"), NULL)
qemu_name_opts ("-name")
    qemu_opts_foreach->parse_name
    parse_name does not use id
qemu_mem_opts ("-m")
    qemu_find_opts_singleton("memory")
qemu_icount_opts ("-icount")
    qemu_opts_foreach->do_configure_icount
    do_configure_icount->icount_configure
    icount_configure does not use id
qemu_smp_opts ("-smp")
    qemu_opts_find(qemu_find_opts("smp-opts"), NULL)
qemu_spice_opts ("-spice")
    QTAILQ_FIRST(&qemu_spice_opts.head)
i.e. they don't need an id. Sometimes its presence is ignored
(e.g. when using qemu_opts_foreach), sometimes all the options
with the id are skipped, sometimes only the first option on the
command line is considered. -boot does two different things
depending on who's looking at the options.
With this patch we just forbid id on merge-lists QemuOptsLists; if the
command line still works, it has the same semantics as before.
qemu_opts_create's fail_if_exists parameter is now unnecessary:
- it is unused if id is NULL
- opts_parse only passes false if reached from qemu_opts_set_defaults,
in which case this patch enforces that id must be NULL
- other callers that can pass a non-NULL id always set it to true
Assert that it is true in the only case where "fail_if_exists" matters,
i.e. "id && !lists->merge_lists". This means that if an id is present,
duplicates are always forbidden, which was already the status quo.
Discounting the case that aborts as it's not user-controlled (it's
"just" a matter of inspecting qemu_opts_create callers), the paths
through qemu_opts_create can be summarized as:
- merge_lists = true: singleton opts with NULL id; non-NULL id fails
- merge_lists = false: always return new opts; non-NULL id fails if dup
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When compiling qemu-fuzz-i386 on aarch64 host, clang reported the following
error:
../util/cacheflush.c:38:44: error: value size does not match register size
specified by the constraint and modifier [-Werror,-Wasm-operand-widths]
asm volatile("mrs\t%0, ctr_el0" : "=r"(save_ctr_el0));
^
../util/cacheflush.c:38:24: note: use constraint modifier "w"
asm volatile("mrs\t%0, ctr_el0" : "=r"(save_ctr_el0));
^~
%w0
Modify the type of save_ctr_el0 to uint64_t to fix it.
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Gan Qixin <ganqixin@huawei.com>
Message-Id: <20210115075656.717957-1-ganqixin@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
These are part of Semihosting for AArch32 and AArch64 Release 2.0
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210107170717.2098982-8-keithp@keithp.com>
Message-Id: <20210108224256.2321-19-alex.bennee@linaro.org>
The yank feature allows recovering from a hanging qemu by "yanking"
at various parts. Other qemu systems can register themselves and
multiple yank functions. Then all yank functions for selected
instances can be called by the 'yank' out-of-band qmp command.
Available instances can be queried by a 'query-yank' oob command.
Signed-off-by: Lukas Straub <lukasstraub2@web.de>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <69934ceacfd33a7dfe53db145ecc630ad39ee47c.1609167865.git.lukasstraub2@web.de>
Acked-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This commit is the result of running the timer-del-timer-free.cocci
script on the whole source tree.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20201215154107.3255-4-peter.maydell@linaro.org
For darwin, the CTR_EL0 register is not accessible, but there
are system routines that we can use.
For other hosts, copy the single pointer implementation from
libgcc and modify it to support the double pointer interface
we require. This halves the number of cache operations required
when split-rwx is enabled.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We are shortly going to have a split rw/rx jit buffer. Depending
on the host, we need to flush the dcache at the rw data pointer and
flush the icache at the rx code pointer.
For now, the two passed pointers are identical, so there is no
effective change in behaviour.
Reviewed-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
qemu_try_memalign() expects a power of 2 alignment:
- posix_memalign(3):
The address of the allocated memory will be a multiple of alignment,
which must be a power of two and a multiple of sizeof(void *).
- _aligned_malloc()
The alignment value, which must be an integer power of 2.
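As a rough illustration of the contract quoted above, the check can be sketched as follows. The names sketch_is_power_of_2() and sketch_try_memalign() are hypothetical and are not the functions touched by this patch; only posix_memalign() is a real API here.

```c
#define _POSIX_C_SOURCE 200112L  /* for posix_memalign() */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: true if v is a non-zero power of two. */
static bool sketch_is_power_of_2(size_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Hypothetical aligned "try" allocator: returns NULL on failure. */
static void *sketch_try_memalign(size_t alignment, size_t size)
{
    void *ptr = NULL;

    /* Enforce the documented contract before calling the allocator. */
    assert(sketch_is_power_of_2(alignment));

    /* posix_memalign() additionally wants a multiple of sizeof(void *). */
    if (alignment < sizeof(void *)) {
        alignment = sizeof(void *);
    }
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }
    return ptr;
}
```

Asserting before the call turns a silently ignored or EINVAL-producing alignment into an immediate, debuggable failure.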
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20201021173803.2619054-3-philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We do not need or want to be allocating page sized quanta.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20201018164836.1149452-1-richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Low-level fd users from QEMU use aio_set_fd_handler(), which handles
event registration with the main loop; qemu_fd_register() is only
needed together with the main loop's poll notifiers, of which SLIRP
is the only user.
This removes a dependency from oslib-win32.c to main-loop.c.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20201218135712.674094-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the 'cmdline' is the last entry in 'rs->history' array, there is
no need to put this entry to the end of the array, partly because it is
the last entry, and partly because the next operation will lead to array
index out of bounds.
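The failure mode can be sketched with a simplified history array. This is an illustration only: SKETCH_MAX_CMDS and sketch_hist_move_to_end() are hypothetical names, and the real readline code is structured differently.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_CMDS 4   /* hypothetical size, kept small on purpose */

/*
 * Move an existing entry to the end of the used part of the history.
 * Returns the index where the entry ends up, or -1 if it is not present.
 */
static int sketch_hist_move_to_end(const char *history[SKETCH_MAX_CMDS],
                                   const char *cmdline)
{
    int idx, last;

    for (idx = 0; idx < SKETCH_MAX_CMDS && history[idx]; idx++) {
        if (strcmp(history[idx], cmdline) == 0) {
            break;
        }
    }
    if (idx == SKETCH_MAX_CMDS || history[idx] == NULL) {
        return -1;
    }
    if (idx == SKETCH_MAX_CMDS - 1) {
        /* Already in the last slot: touching history[idx + 1] would
         * read past the end of the array, so stop here. */
        return idx;
    }

    const char *entry = history[idx];

    /* Find the last used slot at or after idx. */
    last = idx;
    while (last + 1 < SKETCH_MAX_CMDS && history[last + 1]) {
        last++;
    }
    /* Shift the following entries down by one and re-append the match. */
    memmove(&history[idx], &history[idx + 1],
            (size_t)(last - idx) * sizeof(history[0]));
    history[last] = entry;
    return last;
}
```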
Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Alex Chen <alex.chen@huawei.com>
Message-id: 20201203135043.117072-1-alex.chen@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This has been a tcg-specific function, but is also in use
by hardware accelerators via physmem.c. This can cause
link errors when tcg is disabled.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Joelle van Dyne <j@getutm.app>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20201214140314.18544-3-richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
LLVM/Clang supports runtime checks for forward-edge Control-Flow
Integrity (CFI).
CFI on indirect function calls (cfi-icall) ensures that, in indirect
function calls, the function called is of the right signature for the
pointer type defined at compile time.
For this check to work, the code must always respect the function
signature when using a function pointer, the function must be defined
at compile time, and be compiled with link-time optimization.
This rules out, for example, shared libraries that are dynamically loaded
(given that functions are not known at compile time), and code that is
dynamically generated at run-time.
This patch:
1) Introduces the CONFIG_CFI flag to support cfi in QEMU
2) Introduces a decorator to allow the definition of "sensitive"
functions, where a non-instrumented function may be called at runtime
through a pointer. The decorator will take care of disabling cfi-icall
checks on such functions, when cfi is enabled.
3) Marks functions currently in QEMU that exhibit such behavior,
in particular:
- The function in TCG that calls pre-compiled TBs
- The function in TCI that interprets instructions
- Functions in the plugin infrastructures that jump to callbacks
- Functions in util that directly call a signal handler
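A minimal sketch of what such a decorator could look like is shown below. The macro name SKETCH_DISABLE_CFI and the surrounding functions are illustrative assumptions; no_sanitize("cfi-icall") is the Clang attribute for opting a function out of the indirect-call check, and CONFIG_CFI is assumed to be provided by the build system when CFI is enabled.

```c
#include <assert.h>

/* Expands to nothing unless the build enables CFI. */
#ifdef CONFIG_CFI
#define SKETCH_DISABLE_CFI __attribute__((no_sanitize("cfi-icall")))
#else
#define SKETCH_DISABLE_CFI
#endif

typedef int (*sketch_callback)(int);

/* Stand-in for a target that might not be instrumented in a real build. */
static int sketch_double(int x)
{
    return 2 * x;
}

/*
 * "Sensitive" function: it calls through a pointer whose target may be
 * dynamically generated or loaded, so the icall check is disabled here.
 */
SKETCH_DISABLE_CFI
static int sketch_call(sketch_callback cb, int arg)
{
    return cb(arg);
}
```

Keeping the attribute behind a macro means builds without CFI, or with compilers that lack the attribute, see no change.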
Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com>
Acked-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20201204230615.2392-3-dbuono@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QString supports modifying its string, but it's quite limited: you can
only append. The remaining callers use it for building an initial
string, never for modifying it later.
Change keyval_parse_one() to build the initial string with GString.
This is another step towards making QString immutable.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201211171152.146877-19-armbru@redhat.com>
Anywhere we create a list of just one item or by prepending items
(typically because order doesn't matter), we can use
QAPI_LIST_PREPEND(). But places where we must keep the list in order
by appending remain open-coded until later patches.
Note that as a side effect, this also performs a cleanup of two minor
issues in qga/commands-posix.c: the old code was performing
new = g_malloc0(sizeof(*ret));
which 1) is confusing because you have to verify whether 'new' and
'ret' are variables with the same type, and 2) would conflict with C++
compilation (not an actual problem for this file, but makes
copy-and-paste harder).
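For readers unfamiliar with the prepend idiom, here is a self-contained sketch of the pattern. SketchIntList and SKETCH_LIST_PREPEND are hypothetical stand-ins that use plain malloc(); they are not the QAPI macro itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list type with the usual 'next' and 'value' members. */
typedef struct SketchIntList {
    struct SketchIntList *next;
    int value;
} SketchIntList;

/* Allocate a node of the list's own type and push it at the head. */
#define SKETCH_LIST_PREPEND(list, element) do {            \
    __typeof__(list) tmp_ = malloc(sizeof(*(list)));       \
    if (!tmp_) {                                           \
        abort();                                           \
    }                                                      \
    tmp_->value = (element);                               \
    tmp_->next = (list);                                   \
    (list) = tmp_;                                         \
} while (0)

static void sketch_list_free(SketchIntList *list)
{
    while (list) {
        SketchIntList *next = list->next;
        free(list);
        list = next;
    }
}
```

Deriving the allocation size from the list variable itself avoids the "new" versus "ret" type mismatch described above.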
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20201113011340.463563-5-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
[Straightforward conflicts due to commit a8aa94b5f8 "qga: update
schema for guest-get-disks 'dependents' field" and commit a10b453a52
"target/mips: Move mips_cpu_add_definition() from helper.c to cpu.c"
resolved. Commit message tweaked.]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A QemuOptsList can be of one of two kinds: either it is pre-validated, or
it accepts any key and validation happens somewhere else (typically in
a Visitor or against a list of QOM properties). opts_accepts_any
returns true if a QemuOpts instance was created from a QemuOptsList of
the latter kind, but there is no function to do the check on a QemuOptsList.
Since this property comes from the QemuOptsList and almost all callers of
opts_accepts_any use opts->list anyway, modify the function to accept
QemuOptsList.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use strcspn to find an equals sign or comma, and pass the result directly
to get_opt_name to avoid another strchr.
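The idea can be sketched in isolation as follows; sketch_opt_name_len() is a hypothetical helper, not the function changed by this patch.

```c
#include <assert.h>
#include <string.h>

/*
 * Length of the option name at the start of p: everything up to the
 * first '=' or ',' (or the end of the string). A single strcspn() call
 * replaces separate searches for each delimiter.
 */
static size_t sketch_opt_name_len(const char *p)
{
    return strcspn(p, "=,");
}
```

The character at p[len] then tells the caller whether a value follows ('='), the option is a bare flag (','), or the string ended ('\0').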
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_opts_set is used to create default network backends and to
parse sugar options -kernel, -initrd, -append, -bios and -dtb.
These are very different uses:
I would *expect* a function named qemu_opts_set to set an option in a
merge-lists QemuOptsList, such as -kernel, and possibly to set an option
in a non-merge-lists QemuOptsList with non-NULL id, similar to -set.
However, it wouldn't *work* to use qemu_opts_set for the latter
because qemu_opts_set uses fail_if_exists==1. So, for non-merge-lists
QemuOptsList and non-NULL id, the semantics of qemu_opts_set (fail if the
(QemuOptsList, id) pair already exists) are debatable.
On the other hand, I would not expect qemu_opts_set to create a
non-merge-lists QemuOpts with a single option; which it does, though.
For this case of non-merge-lists QemuOptsList and NULL id, qemu_opts_set
hardly adds value over qemu_opts_parse. It does skip some parsing and
unescaping, but that's not needed when creating default network
backends.
So qemu_opts_set has warty behavior for non-merge-lists QemuOptsList
if id is non-NULL, and it's mostly pointless if id is NULL. My
solution to keeping the API as simple as possible is to limit
qemu_opts_set to merge-lists QemuOptsList. For them, it's useful (we
don't want comma-unescaping for -kernel) *and* has sane semantics.
Network backend creation is switched to qemu_opts_parse.
qemu_opts_set is now only used on merge-lists QemuOptsList... except
in the testcase, which is changed to use a merge-list QemuOptsList.
With this change we can also remove the id parameter. With the
parameter always NULL, we know that qemu_opts_create cannot fail
and can pass &error_abort to it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes, with the changes
to the following files manually reverted:
contrib/libvhost-user/libvhost-user-glib.h
contrib/libvhost-user/libvhost-user.c
contrib/libvhost-user/libvhost-user.h
contrib/plugins/hotblocks.c
contrib/plugins/hotpages.c
contrib/plugins/howvec.c
contrib/plugins/lockstep.c
linux-user/mips64/cpu_loop.c
linux-user/mips64/signal.c
linux-user/sparc64/cpu_loop.c
linux-user/sparc64/signal.c
linux-user/x86_64/cpu_loop.c
linux-user/x86_64/signal.c
target/s390x/gen-features.c
tests/fp/platform.h
tests/migration/s390x/a-b-bios.c
tests/plugin/bb.c
tests/plugin/empty.c
tests/plugin/insn.c
tests/plugin/mem.c
tests/test-rcu-simpleq.c
tests/test-rcu-slist.c
tests/test-rcu-tailq.c
tests/uefi-test-tools/UefiTestToolsPkg/BiosTablesTest/BiosTablesTest.c
contrib/plugins/, tests/plugin/, and tests/test-rcu-slist.c appear not
to include osdep.h intentionally. The remaining reverts are the same
as in commit bbfff19688.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201113061216.2483385-1-armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Fix Coverity CID 1435957: Memory - illegal accesses (OVERRUN):
>>> Overrunning array "suffixes" of 7 8-byte elements at element
index 7 (byte offset 63) using index "idx" (which evaluates to 7).
Note, the biggest input value freq_to_str() can accept is UINT64_MAX,
which is ~18.446 EHz, less than 1000 EHz.
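A sketch of the bounded loop, writing into a caller-provided buffer instead of allocating. The name sketch_freq_to_str() and its signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static void sketch_freq_to_str(uint64_t freq_hz, char *buf, size_t buflen)
{
    static const char *const suffixes[] = { "", "K", "M", "G", "T", "P", "E" };
    const size_t n = sizeof(suffixes) / sizeof(suffixes[0]);
    double freq = (double)freq_hz;
    size_t idx = 0;

    /* Scale down by 1000 until the value fits the current suffix. */
    while (freq >= 1000.0) {
        freq /= 1000.0;
        idx++;
    }
    /* UINT64_MAX is below 1000 EHz, so idx can never reach n. */
    assert(idx < n);

    snprintf(buf, buflen, "%0.3g %sHz", freq, suffixes[idx]);
}
```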
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Luc Michel <luc@lmichel.fr>
Message-id: 20201101215755.2021421-1-f4bug@amsat.org
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, when using "nvme://" for a block device, like
-drive file=nvme://0000:01:00.0/1,if=none,id=drive0 \
-device virtio-blk,drive=drive0 \
VFIO may pin all guest memory, and discarding of RAM no longer works as
expected. I was able to reproduce this easily with my
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd
NVMe SSD Controller SM981/PM981/PM983
Similar to common VFIO, we have to disable it, making sure that:
a) virtio-balloon won't discard any memory ("silently disabled")
b) virtio-mem and nvme:// run mutually exclusive
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Wei Yang <richardw.yang@linux.intel.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20201116105947.9194-1-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
This patch covers all the files whose maintainer I could not determine
with the ‘get_maintainer.pl’ script.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Message-Id: <20201023124424.20177-1-chetan4windows@gmail.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Adapted exec.c and qdev-monitor.c to new location]
Signed-off-by: Thomas Huth <thuth@redhat.com>
There is no "version 2" of the "Lesser" General Public License.
It is either "GPL version 2.0" or "Lesser GPL version 2.1".
This patch replaces all occurrences of "Lesser GPL version 2" with
"Lesser GPL version 2.1" in comment section.
Signed-off-by: Chetan Pant <chetan4windows@gmail.com>
Message-Id: <20201023123624.19891-1-chetan4windows@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Lots of fixes all over the place.
virtio-mem and virtio-iommu patches are kind of fixes but
it seems better to just make them behave sanely than
try to educate users about the limitations ...
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc,pci,vhost,virtio: fixes
Lots of fixes all over the place.
virtio-mem and virtio-iommu patches are kind of fixes but
it seems better to just make them behave sanely than
try to educate users about the limitations ...
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 04 Nov 2020 18:40:03 GMT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (31 commits)
contrib/vhost-user-blk: fix get_config() information leak
block/export: fix vhost-user-blk get_config() information leak
block/export: make vhost-user-blk config space little-endian
configure: introduce --enable-vhost-user-blk-server
libvhost-user: follow QEMU comment style
vhost-blk: set features before setting inflight feature
Revert "vhost-blk: set features before setting inflight feature"
net: Add vhost-vdpa in show_netdevs()
vhost-vdpa: Add qemu_close in vhost_vdpa_cleanup
vfio: Don't issue full 2^64 unmap
virtio-iommu: Set supported page size mask
vfio: Set IOMMU page size as per host supported page size
memory: Add interface to set iommu page size mask
virtio-iommu: Add notify_flag_changed() memory region callback
virtio-iommu: Add replay() memory region callback
virtio-iommu: Call memory notifiers in attach/detach
virtio-iommu: Add memory notifiers for map/unmap
virtio-iommu: Store memory region in endpoint struct
virtio-iommu: Fix virtio_iommu_mr()
hw/smbios: Fix leaked fd in save_opt_one() error path
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
OptsVisitor, StringInputVisitor and the keyval visitor have
three different ideas of how a human could write the value of
a boolean option. Pay homage to the backwards-compatibility
gods and make the new common helper accept all four sets (on/off,
true/false, y/n and yes/no), but remove case-insensitivity.
Since OptsVisitor is supposed to match qemu-options, adjust
it as well.
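The accepted spellings can be sketched as a small case-sensitive parser. sketch_bool_parse() is a hypothetical name and signature; the real helper reports errors differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Parse a human-written boolean. Returns true on success and stores the
 * result in *out; returns false for anything else, including spellings
 * that differ only in case.
 */
static bool sketch_bool_parse(const char *value, bool *out)
{
    if (!strcmp(value, "on") || !strcmp(value, "true") ||
        !strcmp(value, "y") || !strcmp(value, "yes")) {
        *out = true;
        return true;
    }
    if (!strcmp(value, "off") || !strcmp(value, "false") ||
        !strcmp(value, "n") || !strcmp(value, "no")) {
        *out = false;
        return true;
    }
    return false;
}
```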
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20201103161339.447118-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The assert() was added in commit b681a1c73e ("block: Repair the
throttling code."), when the qemu_co_queue_do_restart() function
required to be running in a coroutine. It was later made unnecessary in
commit a9d9235567 ("coroutine-lock: reschedule coroutine on the
AioContext it was running on").
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20201027133602.3038018-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Make it possible to compile out the vhost-user-blk server. It is enabled
by default on Linux.
Note that vhost-user-server.c depends on libvhost-user, which requires
CONFIG_LINUX. The CONFIG_VHOST_USER dependency was erroneous since that
option controls vhost-user frontends (previously known as "master") and
not device backends (previously known as "slave").
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20201027173528.213464-3-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
mmap(2) states:
'offset' must be a multiple of the page size as returned
by sysconf(_SC_PAGE_SIZE).
Add an assertion to be sure we don't break this contract.
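The check itself is small; a sketch under the assumption of a standalone helper (sketch_offset_is_page_aligned() is a hypothetical name) looks like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* True if offset is a multiple of the page size reported by sysconf(). */
static bool sketch_offset_is_page_aligned(uint64_t offset)
{
    long page_size = sysconf(_SC_PAGE_SIZE);

    assert(page_size > 0);
    return offset % (uint64_t)page_size == 0;
}
```

A caller would typically wrap this in assert() right before the mmap() call, so a misaligned offset is caught at the call site rather than surfacing as EINVAL.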
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-8-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
The QEMU_VFIO_DEBUG definition is only modifiable at build-time.
Trace events can be enabled at run-time. As we prefer the latter,
convert qemu_vfio_dump_mappings() to use trace events instead
of fprintf().
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-7-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
For debugging purposes, trace where DMA regions are mapped.
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-6-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
For debugging purposes, trace where a BAR is mapped.
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-5-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
For debugging purposes, trace BAR regions info.
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-4-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
We sometimes get kernel panics with some devices on AArch64
hosts. Alex Williamson suggests the cause might be a broken PCIe
root complex. Add a trace event to record the latest I/O
access before crashing. Just in case, also assert that our accesses
are aligned.
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-3-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Replace the confusing "VFIO IOMMU check failed" error message with
the explicit "VFIO IOMMU Type1 is not supported" one.
Example on POWER:
$ qemu-system-ppc64 -drive if=none,id=nvme0,file=nvme://0001:01:00.0/1,format=raw
qemu-system-ppc64: -drive if=none,id=nvme0,file=nvme://0001:01:00.0/1,format=raw: VFIO IOMMU Type1 is not supported
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20201103020733.2303148-2-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Commit 9ce44e2ce2 "qmp: Move dispatcher to a coroutine" modified
aio_poll() in util/aio-posix.c to avoid an assertion failure. This
change is missing in util/aio-win32.c.
Apply the changes to util/aio-posix.c to util/aio-win32.c too.
This fixes an assertion failure on Windows whenever QEMU exits.
$ ./qemu-system-x86_64.exe -machine pc,accel=tcg -display gtk
**
ERROR:../qemu/util/aio-win32.c:337:aio_poll: assertion failed:
(in_aio_context_home_thread(ctx))
Bail out! ERROR:../qemu/util/aio-win32.c:337:aio_poll: assertion
failed: (in_aio_context_home_thread(ctx))
Fixes: 9ce44e2ce2 ("qmp: Move dispatcher to a coroutine")
Signed-off-by: Volker Rümelin <vr_qemu@t-online.de>
Message-Id: <20201021064033.8600-1-vr_qemu@t-online.de>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Memory returned by get_relocated_path must be freed with
free or g_free depending on the path that the function
took; Coverity takes exception to this practice. The
fix lets caller use g_free as is standard in QEMU.
While at it, mention the requirements on the caller in
the doc comment.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The abstract socket namespace is a non-portable Linux extension. An
attempt to use it elsewhere should fail with ENOENT (the abstract
address looks like a "" pathname, which does not resolve). We report
this failure like
Failed to connect socket abc: No such file or directory
Tolerable, although ENOTSUP would be better.
However, introspection lies: it has @abstract regardless of host
support. Easy enough to fix: since Linux provides them since 2.2,
'if': 'defined(CONFIG_LINUX)' should do.
The above failure becomes
Parameter 'backend.data.addr.data.abstract' is unexpected
I consider this an improvement.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
unix_listen_saddr() replaces an empty @path with a unique value. It obtains
the value by creating and deleting a unique temporary file with
mkstemp(). This is racy, as the comment explains. It's also entirely
undocumented as far as I can tell. Goes back to commit d247d25f18
"sockets: helper functions for qemu (Gerd Hoffman)", v0.10.0.
Since abstract socket addresses have no connection with filesystem
pathnames, making them up with mkstemp() seems inappropriate. Bypass
the replacement of empty @path.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Commit 776b97d360 "qemu-sockets: add abstract UNIX domain socket
support" neglected to update socket_sockaddr_to_address_unix(). The
function returns a non-abstract socket address for abstract
sockets (wrong) with a null @path (also wrong; a non-optional QAPI str
member must never be null).
The null @path is due to confused code going back all the way to
commit 17c55decec "sockets: add helpers for creating SocketAddress
from a socket".
Add the required special case, and simplify the confused code.
Fixes: 776b97d360
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
An optional bool member of a QAPI struct can be false, true, or absent.
The previous commit demonstrated that socket_listen() and
socket_connect() are broken for absent @tight, and indeed QMP
chardev-add also defaults absent member @tight to false instead of true.
In C, QAPI members are represented by two fields, has_MEMBER and MEMBER.
We have:
            has_MEMBER   MEMBER
  false     true         false
  true      true         true
  absent    false        false/ignore
When has_MEMBER is false, MEMBER should be set to false on write, and
ignored on read.
For QMP, the QAPI visitors handle absent @tight by setting both
@has_tight and @tight to false. unix_listen_saddr() and
unix_connect_saddr() however use @tight only, disregarding @has_tight.
This is wrong and means that absent @tight defaults to false whereas it
should default to true.
The same is true for @has_abstract, though @abstract defaults to
false and therefore has the same behavior for all of QMP, HMP and CLI.
Fix unix_listen_saddr() and unix_connect_saddr() to check
@has_abstract/@has_tight, and to default absent @tight to true.
However, this is only half of the story. HMP chardev-add and CLI
-chardev so far correctly defaulted @tight to true, but default to
false again with the above fix for HMP and CLI. In fact, the "tight"
and "abstract" options now break completely.
Digging deeper, we find that qemu_chr_parse_socket() also ignores
@has_tight, leaving it false when it sets @tight. That is also wrong,
but the two wrongs cancelled out. Fix qemu_chr_parse_socket() to set
@has_tight and @has_abstract; writing testcases for HMP and CLI is left
for another day.
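The defaulting rule for the two members can be sketched as follows. SketchUnixAddr and the two helpers are hypothetical simplifications of the generated QAPI type and of whatever accessors the real code uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cut-down view of the optional members discussed above. */
typedef struct SketchUnixAddr {
    bool has_abstract;
    bool abstract;
    bool has_tight;
    bool tight;
} SketchUnixAddr;

/* Absent @abstract defaults to false. */
static bool sketch_saddr_is_abstract(const SketchUnixAddr *a)
{
    return a->has_abstract && a->abstract;
}

/* Absent @tight defaults to true. */
static bool sketch_saddr_is_tight(const SketchUnixAddr *a)
{
    return !a->has_tight || a->tight;
}
```

Reading the members only through such helpers keeps the default in one place, instead of relying on every producer to set MEMBER consistently when has_MEMBER is false.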
Fixes: 776b97d360
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reporting "Failed to connect socket" is essentially useless for a user
attempting to diagnose failure. It needs to include the target address
details. Similarly when failing to create a socket we should include the
socket family info, so the user understands what particular feature was
missing in their kernel build (IPv6, VSock in particular).
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We want missing symbols to fail module load right away instead of having
qemu abort later on in case lazy binding fails. Can happen -- for
example -- when trying to load a module for a pci device
(virtio-gpu-pci) into a qemu without pci support (qemu-system-avr).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20201028054944.5772-1-kraxel@redhat.com