engaged_in_io could be unset by an MR with re-entrancy checks disabled.
Ensure that only MRs that can set the engaged_in_io flag can unset it.
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20230516084002.3813836-1-alxndr@bu.edu>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Biosbits avocado tests on gitlab has thus far been disabled because some
packages needed by this test was missing in the container images used by gitlab
CI. These packages have now been added with the commit:
da9000784c ("tests/lcitool: Add mtools and xorriso and remove genisoimage as dependencies")
Therefore, this change enables bits avocado test on gitlab.
At the same time, the bits cleanup code has also been made more robust with
this change.
Signed-off-by: Ani Sinha <anisinha@redhat.com>
Message-Id: <20230517065357.5614-1-anisinha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
qmp-intro.txt is quite small and provides very little information
that isn't already in the documentation elsewhere. Fold the example
command lines into qemu-options.hx, and delete the now-unneeded plain
text document.
While we're touching the qemu-options.hx documentation text,
wordsmith it a little bit and improve the rST formatting.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-4-peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The description text for a parsing error has changed since the
spec doc was first written; update the example in the docs.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-3-peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Convert the qmp-spec.txt document to restructuredText.
Notable points about the conversion:
* numbers at the start of section headings are removed, to match
the style of the rest of the manual
* cross-references to other sections or documents are hyperlinked
* various formatting tweaks (notably the examples, which need the
-> and <- prefixed so the QMP code-block lexer will accept them)
* English prose fixed in a few places
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230515162245.3964307-2-peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[.. code-block:: dumbed down to :: to work around CI failure]
The error message is bad when the section is untagged. For instance,
test case doc-interleaved-section produces "'@foobar:' can't follow
'Note' section", which is okay, but if we drop the "Note:" tag, we get
"'@foobar:' can't follow 'None' section, which is bad.
Change the error message to "description of '@foobar:' follows a
section".
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230510141637.3685080-1-armbru@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Conflict with commit 3e32dca3f0 resolved]
Thanks to the fixes from the previous patches, we can now run
the full set of "make check" with all targets here.
Message-Id: <20230512124033.502654-19-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This way QEMU won't complain in case the VGA card or the NIC device
are not available in the binary, thus it won't spoil the output
and the test then passes with such QEMU binaries that have a limited
configuration, too.
Message-Id: <20230512124033.502654-18-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
These tests rely on a default NIC to be available. Skip them if we
used the "--without-default-devices" configure option.
Message-Id: <20230512124033.502654-17-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The devices might not have been compiled into the QEMU binary, so we
have to check before we can use them.
Message-Id: <20230512124033.502654-16-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
virtio-balloon-ccw is already tested in the device-plug-test,
virtio-blk-ccw is already tested in cdrom-test, and virtio-net-ccw
is already tested in the pxe-test, so there is not much point
in doing "nop" tests here again.
Message-Id: <20230512124033.502654-15-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It's possible to disable virtio-scsi and virtio-blk in the binaries,
so we must not run the corresponding tests if these devices are missing.
Message-Id: <20230512124033.502654-14-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The test is already fenced with CONFIG_USB_UHCI in meson.build, but in
case we build the ppc or mips targets in parallel, this config switch
is still set in "config_all_devices" and thus the test is still run.
Thus we need an explicit additional check here before adding the tests
to the test plan.
Message-Id: <20230512124033.502654-13-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The USB controllers might not be available in the QEMU binary
(e.g. when using the "--without-default-devices" configure switch),
so we have to check whether the devices can be used before running
the related test.
Message-Id: <20230512124033.502654-12-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Announce the default NIC via MachineClass->default_nic and set up
MachineClass->no_parallel according to the availability of the
"isa-parallel" device, so that the Sun machines also work when
QEMU has been configured with "--without-default-devices".
Message-Id: <20230512124033.502654-11-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Don't try to instantiate the parallel port if it has not been
enabled in the build configuration.
Message-Id: <20230512124033.502654-10-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are going to require the macro from other files, too, so move
this #define to the header file.
Message-Id: <20230512124033.502654-9-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230512124033.502654-8-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230512124033.502654-7-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Mark the default NIC via the new MachineClass->default_nic setting
so that the machine-defaults code in vl.c can decide whether the
default NIC is usable or not (for example when compiling with the
"--without-default-devices" configure switch).
Message-Id: <20230512124033.502654-6-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Don't try to instantiate a default NIC if it is not available (since
this will cause QEMU to abort). Emit a warning instead.
Message-Id: <20230512124033.502654-5-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We are going to re-use this setting for other targets, so let's
move this to the main MachineClass.
Message-Id: <20230512124033.502654-4-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
In case the user disabled the default VGA device in the binary (e.g.
with the "--without-default-devices" configure switch), we should
not try to use it by default if QEMU is running with the default
devices, otherwise it aborts when trying to use it. Simply emit a
warning instead.
Message-Id: <20230512124033.502654-3-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The "isapc" machine can also be run without VGA card, so there
is no need for a hard requirement with a "select" here - "imply"
is enough.
Message-Id: <20230512124033.502654-2-thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
CXL volatile memory support
More memslots for vhost-user on x86 and ARM.
vIOMMU support for vhost-vdpa
pcie-to-pci bridge can now be compiled out
MADT revision bumped to 3
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRniWoPHG1zdEByZWRo
YXQuY29tAAoJECgfDbjSjVRpN4MH/RqdvHmujrjvjzXbbN/gq87Njp+kQLKEooIE
ZkqdNaVUE6vjCH8iU+chjsxt4VSquSjOL9CWWrYefEIeqCFLWsuXSAY0VDAbY67x
+aes51tTYILVsx7fbb+T5mJKRgVuWW4C5KaGeQ1djSexy42nvplZUJdIJUhZr0t9
dzzOsD+mezHS7Xu2QOzSfl5QQRuOVVJnjJXkqJG/yRvHrZM5aTolatr/X7jNGedm
4oyMsVMaAcQ+dnEQigRJodf/MpFfs9DfNZAH55VwwQWsNT0t0ueD0xigR203jjaE
mJJJipAqetFax2JjC7QMXWf+LR36BnL/0/xH+x/BWb0FI42wr0I=
=ajmR
-----END PGP SIGNATURE-----
Merge tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu into staging
virtio,pc,pci: fixes, features, cleanups
CXL volatile memory support
More memslots for vhost-user on x86 and ARM.
vIOMMU support for vhost-vdpa
pcie-to-pci bridge can now be compiled out
MADT revision bumped to 3
Fixes, cleanups all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmRniWoPHG1zdEByZWRo
# YXQuY29tAAoJECgfDbjSjVRpN4MH/RqdvHmujrjvjzXbbN/gq87Njp+kQLKEooIE
# ZkqdNaVUE6vjCH8iU+chjsxt4VSquSjOL9CWWrYefEIeqCFLWsuXSAY0VDAbY67x
# +aes51tTYILVsx7fbb+T5mJKRgVuWW4C5KaGeQ1djSexy42nvplZUJdIJUhZr0t9
# dzzOsD+mezHS7Xu2QOzSfl5QQRuOVVJnjJXkqJG/yRvHrZM5aTolatr/X7jNGedm
# 4oyMsVMaAcQ+dnEQigRJodf/MpFfs9DfNZAH55VwwQWsNT0t0ueD0xigR203jjaE
# mJJJipAqetFax2JjC7QMXWf+LR36BnL/0/xH+x/BWb0FI42wr0I=
# =ajmR
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 May 2023 07:36:26 AM PDT
# gpg: using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg: issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [undefined]
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>" [undefined]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* tag 'for_upstream' of https://git.kernel.org/pub/scm/virt/kvm/mst/qemu: (40 commits)
hw/i386/pc: No need for rtc_state to be an out-parameter
hw/i386/pc: Create RTC controllers in south bridges
hw/cxl: Introduce cxl_device_get_timestamp() utility function
hw/cxl: rename mailbox return code type from ret_code to CXLRetCode
hw/pci-bridge: make building pcie-to-pci bridge configurable
virtio-pci: add handling of PCI ATS and Device-TLB enable/disable
hw/pci-host/pam: Make init_pam() usage more readable
hw/i386/pc: Initialize ram_memory variable directly
hw/i386/pc_{q35,piix}: Minimize usage of get_system_memory()
hw/i386/pc_{q35,piix}: Reuse MachineClass::desc as SMB product name
hw/i386/pc_q35: Reuse machine parameter
hw/pci-host/q35: Inline sysbus_add_io()
hw/pci-host/i440fx: Inline sysbus_add_io()
vhost-vdpa: Add support for vIOMMU.
vhost-vdpa: Add check for full 64-bit in region delete
vhost_vdpa: fix the input in trace_vhost_vdpa_listener_region_del()
vhost: expose function vhost_dev_has_iommu()
virtio-crypto: fix NULL pointer dereference in virtio_crypto_free_request
virtio-net: not enable vq reset feature unconditionally
vhost-user: Remove acpi-specific memslot limit
...
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When running on the Kubernetes runner, this CI job is timing out.
Raise the limit to give the job enough time to run.
Signed-off-by: Camilla Conte <cconte@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230407145252.32955-2-cconte@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Configure Gitlab CI to run on Kubernetes
according to the official documentation.
https://docs.gitlab.com/ee/ci/docker/using_docker_build.html#docker-in-docker-with-tls-enabled-in-kubernetes
These changes are needed because of the CI jobs
using Docker-in-Docker (dind).
As soon as Docker-in-Docker is replaced with Kaniko,
these changes can be reverted.
I documented what I did to set up the Kubernetes runner on the wiki:
https://wiki.qemu.org/Testing/CI/KubernetesRunners
Signed-off-by: Camilla Conte <cconte@redhat.com>
Message-Id: <20230407145252.32955-1-cconte@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Some scripts are invoked via the first "python3" binary in the PATH,
because they are executable and their shebang line is "#! /usr/bin/env
python3". To enforce usage of $(PYTHON), make them nonexecutable.
Scripts invoked via meson need nothing else, and meson-buildoptions.py
is already using $(PYTHON). For probe-gdb-support.py however the
invocation in the configure script has to be adjusted.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since custom runners are not generally available, make it possible to
debug the differences between a successful and a failing build by
comparing the logs and the build.ninja rules.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If sphinx is present but the theme is not, mkvenv will print an
inaccurate diagnostic:
ERROR: Could not find a version that satisfies the requirement sphinx-rtd-theme>=0.5.0 (from versions: none)
ERROR: No matching distribution found for sphinx-rtd-theme>=0.5.0
'sphinx>=1.6.0' not found:
• Python package 'sphinx' version '5.3.0' was found, but isn't suitable.
• mkvenv was configured to operate offline and did not check PyPI.
Instead, ignore the packages that were found to be present, and report
an error based on the first absent package.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reintroduce the cmd_line.txt mangling to remove the sphinx_build option
when rerunning meson. The mechanism was removed in commit 75cc286485
("configure: remove backwards-compatibility code", 2023-01-11) because
the fixups were obsolete at the time; however, the Meson deprecation
mechanism doesn't quite work when options are finally removed, so we
need to bring it back.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not use the rule in build.ninja, because the path to meson is hardcoded
in build.ninja and this breaks if meson moves (for example if the distro
meson suddenly becomes too old after an update).
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
importlib.metadata is just as good as distlib.database and a bit more
battle-proven for "egg" based distributions, and in fact that is exactly
why mkvenv.py is not using distlib.database to find entry points: it
simply does not work for eggs.
The only disadvantage of importlib.metadata is that it is not available
by default before Python 3.8, so we need a fallback to pkg_resources
(again, just like for the case of finding entry points). Do so to
fix issues where incorrect egg metadata results in a JSONDecodeError.
While at it, reuse the new _get_version function to diagnose an incorrect
version of the package even if importlib.metadata is not available.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This tests exercises graph locking, draining, and graph modifications
with AioContext switches a lot. Amongst others, it serves as a
regression test for bdrv_graph_wrlock() deadlocking because it is called
with a locked AioContext and for AioContext handling in the NBD server.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-4-kwolf@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
nbd_drained_poll() generally runs in the main thread, not whatever
iothread the NBD server coroutine is meant to run in, so it can't
directly reenter the coroutines to wake them up.
The code seems to have the right intention, it specifies the correct
AioContext when it calls qemu_aio_coroutine_enter(). However, this
functions doesn't schedule the coroutine to run in that AioContext, but
it assumes it is already called in the home thread of the AioContext.
To fix this, add a new thread-safe qio_channel_wake_read() that can be
called in the main thread to wake up the coroutine in its AioContext,
and use this in nbd_drained_poll().
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In QEMU 8.0, we've been seeing deadlocks in bdrv_graph_wrlock(). They
come from callers that hold an AioContext lock, which is not allowed
during polling. In theory, we could temporarily release the lock, but
callers are inconsistent about whether they hold a lock, and if they do,
some are also confused about which one they hold. While all of this is
fixable, it's not trivial, and the best course of action for 8.0.1 is
probably just disabling the graph locking code temporarily.
We don't currently rely on graph locking yet. It is supposed to replace
the AioContext lock eventually to enable multiqueue support, but as long
as we still have the AioContext lock, it is sufficient without the graph
lock. Once the AioContext lock goes away, the deadlock doesn't exist any
more either and this commit can be reverted. (Of course, it can also be
reverted while the AioContext lock still exists if the callers have been
fixed.)
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230517152834.277483-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-3-stefanha@redhat.com>
[kwolf: Restrict to CONFIG_POSIX, Windows doesn't support polling]
Tested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QEMU's event loop supports nesting, which means that event handler
functions may themselves call aio_poll(). The condition that triggered a
handler must be reset before the nested aio_poll() call, otherwise the
same handler will be called and immediately re-enter aio_poll. This
leads to an infinite loop and stack exhaustion.
Poll handlers are especially prone to this issue, because they typically
reset their condition by finishing the processing of pending work.
Unfortunately it is during the processing of pending work that nested
aio_poll() calls typically occur and the condition has not yet been
reset.
Disable a poll handler during ->io_poll_ready() so that a nested
aio_poll() call cannot invoke ->io_poll_ready() again. As a result, the
disabled poll handler and its associated fd handler do not run during
the nested aio_poll(). Calling aio_set_fd_handler() from inside nested
aio_poll() could cause it to run again. If the fd handler is pending
inside nested aio_poll(), then it will also run again.
In theory fd handlers can be affected by the same issue, but they are
more likely to reset the condition before calling nested aio_poll().
This is a special case and it's somewhat complex, but I don't see a way
around it as long as nested aio_poll() is supported.
Cc: qemu-stable@nongnu.org
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=2186181
Fixes: c382706925 ("block: Mark bdrv_co_io_(un)plug() and callers GRAPH_RDLOCK")
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20230502184134.534703-2-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Skip TestBlockdevReopen.test_insert_compress_filter() if the 'compress'
driver isn't available.
In order to make the test succeed when the case is skipped, we also need
to remove any output from it (which would be missing in the case where
we skip it). This is done by replacing qemu_io_log() with qemu_io(). In
case of failure, qemu_io() raises an exception with the output of the
qemu-io binary in its message, so we don't actually lose anything.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230511143801.255021-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are some conditions under which we don't actually need to do
anything for taking a reader lock: Writing the graph is only possible
from the main context while holding the BQL. So if a reader is running
in the main context under the BQL and knows that it won't be interrupted
until the next writer runs, we don't actually need to do anything.
This is the case if the reader code neither has a nested event loop
(this is forbidden anyway while you hold the lock) nor is a coroutine
(because a writer could run when the coroutine has yielded).
These conditions are exactly what bdrv_graph_rdlock_main_loop() asserts.
They are not fulfilled in bdrv_graph_co_rdlock(), which always runs in a
coroutine.
This deletes the shortcuts in bdrv_graph_co_rdlock() that skip taking
the reader lock in the main thread.
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-9-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When jobs are sleeping, for example to enforce a given rate limit, they
can be reentered early, in particular in order to get paused, to update
the rate limit or to get cancelled.
Before this patch, they behave in this case as if they had fully
completed their rate limiting delay. This means that requests are sped
up beyond their limit, violating the constraints that the user gave us.
Change the block jobs to sleep in a loop until the necessary delay is
completed, while still allowing cancelling them immediately as well
pausing (handled by the pause point in job_sleep_ns()) and updating the
rate limit.
This change is also motivated by iotests cases being prone to fail
because drain operations pause and unpause them so often that block jobs
complete earlier than they are supposed to. In particular, the next
commit would fail iotests 030 without this change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-8-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_unref() is a no_coroutine_fn, so calling it from coroutine context
is invalid. Use bdrv_co_unref() instead.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-7-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-6-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we take a reader lock, we can't call any functions that take a writer
lock internally without causing deadlocks once the reader lock is
actually enforced in the main thread, too. Take the reader lock only
where it is actually needed.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-5-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qcow2_do_open() calls a few no_co_wrappers that wrap functions taking
the graph lock internally as a writer. Therefore, it can't hold the
reader lock across these calls, it causes deadlocks. Drop the lock
temporarily around the calls.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-4-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are some error paths in blk_exp_add() that jump to 'fail:' before
'exp' is even created. So we can't just unconditionally access exp->blk.
Add a NULL check, and switch from exp->blk to blk, which is available
earlier, just to be extra sure that we really cover all cases where
BlockDevOps could have been set for it (in practice, this only happens
in drv->create() today, so this part of the change isn't strictly
necessary).
Fixes: Coverity CID 1509238
Fixes: de79b52604
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-3-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These are functions that modify the graph, so they must be able to take
a writer lock. This is impossible if they already hold the reader lock.
If they need a reader lock for some of their operations, they should
take it internally.
Many of them go through blk_*(), which will always take the lock itself.
Direct calls of bdrv_*() need to take the reader lock. Note that while
locking for bdrv_co_*() calls is checked by TSA, this is not the case
for the mixed_coroutine_fns bdrv_*(). Holding the lock is still required
when they are called from coroutine context like here!
This effectively reverts 4ec8df0183, but adds some internal locking
instead.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20230510203601.418015-2-kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>