Commit Graph

1326 Commits

Author SHA1 Message Date
Peter Maydell ad768e6f2a include: Move qemu_[id]cache_* declarations to new qemu/cacheinfo.h
The qemu_icache_linesize, qemu_icache_linesize_log,
qemu_dcache_linesize, and qemu_dcache_linesize_log variables are not
used in many files.  Move them out of osdep.h to a new
qemu/cacheinfo.h, and document them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-5-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Peter Maydell 5b3e34315a include: Move QEMU_MAP_* constants to mmap-alloc.h
The QEMU_MAP_* constants are used only as arguments to the
qemu_ram_mmap() function.  Move them to mmap-alloc.h, where that
function's prototype is defined.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-4-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Peter Maydell f2241d16ea include: Move qemu_mprotect_*() to new qemu/mprotect.h
The qemu_mprotect_*() family of functions are used in very few files;
move them from osdep.h to a new qemu/mprotect.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-3-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Peter Maydell b85ea5fa2f include: Move qemu_madvise() and related #defines to new qemu/madvise.h
The function qemu_madvise() and the QEMU_MADV_* constants associated
with it are used in only 10 files.  Move them out of osdep.h to a new
qemu/madvise.h header that is included where it is needed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20220208200856.3558249-2-peter.maydell@linaro.org
2022-02-21 13:30:20 +00:00
Vitaly Chikunov e64e27d5cb 9pfs: Fix segfault in do_readdir_many caused by struct dirent overread
`struct dirent' returned from readdir(3) could be shorter (or longer)
than `sizeof(struct dirent)', thus memcpy of sizeof length will overread
into unallocated page causing SIGSEGV. Example stack trace:

 #0  0x00005555559ebeed v9fs_co_readdir_many (/usr/bin/qemu-system-x86_64 + 0x497eed)
 #1  0x00005555559ec2e9 v9fs_readdir (/usr/bin/qemu-system-x86_64 + 0x4982e9)
 #2  0x0000555555eb7983 coroutine_trampoline (/usr/bin/qemu-system-x86_64 + 0x963983)
 #3  0x00007ffff73e0be0 n/a (n/a + 0x0)

While fixing this, provide a helper for any future `struct dirent' cloning.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/841
Cc: qemu-stable@nongnu.org
Co-authored-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by: Vitaly Chikunov <vt@altlinux.org>
Tested-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Acked-by: Greg Kurz <groug@kaod.org>
Tested-by: Vitaly Chikunov <vt@altlinux.org>
Message-Id: <20220216181821.3481527-1-vt@altlinux.org>
[C.S. - Fix typo in source comment. ]
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
2022-02-17 16:57:58 +01:00
Hiroki Narukawa 4c41c69e05 util: adjust coroutine pool size to virtio block queue
Coroutine pool size was 64 from long ago, and the basis was organized in the commit message in 4d68e86b.

At that time, virtio-blk queue-size and num-queue were not configuable, and equivalent values were 128 and 1.

Coroutine pool size 64 was fine then.

Later queue-size and num-queue got configuable, and default values were increased.

Coroutine pool with size 64 exhausts frequently with random disk IO in new size, and slows down.

This commit adjusts coroutine pool size adaptively with new values.

This commit adds 64 by default, but now coroutine is not only for block devices,

and is not too much burdon comparing with new default.

pool size of 128 * vCPUs.

Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp>
Message-id: 20220214115302.13294-2-hnarukaw@yahoo-corp.jp
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-02-14 17:11:25 +00:00
Ivanov Arkady 91d4032710 plugins: add helper functions for coverage plugins
Which provide information about:
- start_code.
- end_code.
- entry.
- path to the executable binary.

Signed-off-by: Ivanov Arkady <arkadiy.ivanov@ispras.ru>
Message-Id: <163491883461.304355.8210754161847179432.stgit@pc-System-Product-Name>
[AJB: reword title, better descriptions, defaults, rm export, fix include]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20220204204335.1689602-22-alex.bennee@linaro.org>
2022-02-09 12:08:42 +00:00
Kevin Wolf 520d8b40e8 block/export: Fix vhost-user-blk shutdown with requests in flight
The vhost-user-blk export runs requests asynchronously in their own
coroutine. When the vhost connection goes away and we want to stop the
vhost-user server, we need to wait for these coroutines to stop before
we can unmap the shared memory. Otherwise, they would still access the
unmapped memory and crash.

This introduces a refcount to VuServer which is increased when spawning
a new request coroutine and decreased before the coroutine exits. The
memory is only unmapped when the refcount reaches zero.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20220125151435.48792-1-kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2022-02-01 13:49:15 +01:00
Frédéric Pétrot e9d07601f6 qemu/int128: addition of div/rem 128-bit operations
Addition of div and rem on 128-bit integers, using the 128/64->128 divu and
64x64->128 mulu in host-utils.
These operations will be used within div/rem helpers in the 128-bit riscv
target.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-id: 20220106210108.138226-4-frederic.petrot@univ-grenoble-alpes.fr
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2022-01-08 15:46:10 +10:00
David Hildenbrand a384bfa32e util/oslib-posix: Support MADV_POPULATE_WRITE for os_mem_prealloc()
Let's sense support and use it for preallocation. MADV_POPULATE_WRITE
does not require a SIGBUS handler, doesn't actually touch page content,
and avoids context switches; it is, therefore, faster and easier to handle
than our current approach.

While MADV_POPULATE_WRITE is, in general, faster than manual
prefaulting, and especially faster with 4k pages, there is still value in
prefaulting using multiple threads to speed up preallocation.

More details on MADV_POPULATE_WRITE can be found in the Linux commits
4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault
page tables") and eb2faa513c24 ("mm/madvise: report SIGBUS as -EFAULT for
MADV_POPULATE_(READ|WRITE)"), and in the man page proposal [1].

This resolves the TODO in do_touch_pages().

In the future, we might want to look into using fallocate(), eventually
combined with MADV_POPULATE_READ, when dealing with shared file/fd
mappings and not caring about memory bindings.

[1] https://lkml.kernel.org/r/20210816081922.5155-1-david@redhat.com

Reviewed-by: Pankaj Gupta <pankaj.gupta@ionos.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20211217134611.31172-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-01-07 05:19:55 -05:00
Marc-André Lureau 3e301c8d7e ui/dbus: add chardev backend & interface
Add a new chardev backend which allows D-Bus client to handle the
chardev stream & events.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-21 10:50:22 +04:00
Marc-André Lureau 4085b87ff0 option: add g_auto for QemuOpts
Used in the next commit.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-21 10:50:22 +04:00
Marc-André Lureau 99997823bb ui/dbus: add p2p=on/off option
Add an option to use direct connections instead of via the bus. Clients
are accepted with QMP add_client.

This allows to provide the D-Bus display without a bus. It also
simplifies the testing setup (some CI have issues to setup a D-Bus bus
in a container).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-21 10:50:22 +04:00
Marc-André Lureau 142ca628a7 ui: add a D-Bus display backend
The "dbus" display backend exports the QEMU consoles and other
UI-related interfaces over D-Bus.

By default, the connection is established on the session bus, but you
can specify a different bus with the "addr" option.

The backend takes the "org.qemu" service name, while still allowing
further instances to queue on the same name (so you can lookup all the
available instances too). It accepts any number of clients at this
point, although this is expected to evolve with options to restrict
clients, or only accept p2p via fd passing.

The interface is intentionally very close to the internal QEMU API,
and can be introspected or interacted with busctl/dfeet etc:

$ ./qemu-system-x86_64 -name MyVM -display dbus
$ busctl --user introspect org.qemu /org/qemu/Display1/Console_0

org.qemu.Display1.Console           interface -         -               -
.RegisterListener                   method    h         -               -
.SetUIInfo                          method    qqiiuu    -               -
.DeviceAddress                      property  s         "pci/0000/01.0" emits-change
.Head                               property  u         0               emits-change
.Height                             property  u         480             emits-change
.Label                              property  s         "VGA"           emits-change
.Type                               property  s         "Graphic"       emits-change
.Width                              property  u         640             emits-change
[...]

See the interfaces XML source file and Sphinx docs for the generated API
documentations.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
2021-12-21 10:50:22 +04:00
Hanna Reitz 079bff693b transactions: Invoke clean() after everything else
Invoke the transaction drivers' .clean() methods only after all
.commit() or .abort() handlers are done.

This makes it easier to have nested transactions where the top-level
transactions pass objects to lower transactions that the latter can
still use throughout their commit/abort phases, while the top-level
transaction keeps a reference that is released in its .clean() method.

(Before this commit, that is also possible, but the top-level
transaction would need to take care to invoke tran_add() before the
lower-level transaction does.  This commit makes the ordering
irrelevant, which is just a bit nicer.)

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Message-Id: <20211111120829.81329-8-hreitz@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211115145409.176785-8-kwolf@redhat.com>
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
2021-11-16 09:43:44 +01:00
Richard Henderson 1b9fc6d8ba * Fixes for SGX
* force_rcu notifiers
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmGMQFwUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPDcAf8CBwO73zxJd0Z3eHzgiSlDavb+ORy
 vXkgGNMHBavaO+QuKzWrzuz42+6r+BW3mlMLEWGyEUGq7ZLbAwTGQ+zT+La8J+TG
 xK872G8skl1j9Xb1TL7t/DeT9ja4MlZbB0LehFa/GIgh2V6mFjXTzH05PH5p9hd0
 M8JGiLtrPEcIv4Df+T3pxbuQy45FqD4hLtEZJW4mKUm2oywxwHOLFty5+VVRxw5h
 Rl5Xuf5UfhAdmmBIyIjhVcVGJf+I2Fg7M+6uf62RQ2SlVdg2ufanEL2uCYYPt4sD
 kDbybursvyqf1IW4LF0vP2KznQE2Hckj6FeACYw32HrlQT6UzX7nbu2TdA==
 =70MJ
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* Fixes for SGX
* force_rcu notifiers

# gpg: Signature made Wed 10 Nov 2021 10:57:48 PM CET
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu:
  sgx: Reset the vEPC regions during VM reboot
  numa: avoid crash with SGX and "info numa"
  accel/tcg: Register a force_rcu notifier
  rcu: Introduce force_rcu notifier
  target/i386: sgx: mark device not user creatable

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-11-11 09:56:22 +01:00
Greg Kurz ef149763a8 rcu: Introduce force_rcu notifier
The drain_rcu_call() function can be blocked as long as an RCU reader
stays in a read-side critical section. This is typically what happens
when a TCG vCPU is executing a busy loop. It can deadlock the QEMU
monitor as reported in https://gitlab.com/qemu-project/qemu/-/issues/650 .

This can be avoided by allowing drain_rcu_call() to enforce an RCU grace
period. Since each reader might need to do specific actions to end a
read-side critical section, do it with notifiers.

Prepare ground for this by adding a notifier list to the RCU reader
struct and use it in wait_for_readers() if drain_rcu_call() is in
progress. An API is added for readers to register their notifiers.

This is largely based on a draft from Paolo Bonzini.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211109183523.47726-2-groug@kaod.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-10 13:20:15 +01:00
Luis Pires e06049f380 host-utils: Introduce mulu128
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211029192417.400707-6-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-11-09 10:32:52 +11:00
John Snow 450e0f28a4 docs: remove non-reference uses of single backticks
The single backtick markup in ReST is the "default role". Currently,
Sphinx's default role is called "content". Sphinx suggests you can use
the "Any" role instead to turn any single-backtick enclosed item into a
cross-reference.

This is useful for things like autodoc for Python docstrings, where it's
often nicer to reference other types with `foo` instead of the more
laborious :py:meth:`foo`. It's also useful in multi-domain cases to
easily reference definitions from other Sphinx domains, such as
referencing C code definitions from outside of kerneldoc comments.

Before we do that, though, we'll need to turn all existing usages of the
"content" role to inline verbatim markup wherever it does not correctly
resolve into a cross-refernece by using double backticks instead.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20211004215238.1523082-2-jsnow@redhat.com>
2021-11-08 12:27:23 +04:00
Alex Bennée 357af9be5c plugins: try and make plugin_insn_append more ergonomic
Currently we make the assumption that the guest frontend loads all
op code bytes sequentially. This mostly holds up for regular fixed
encodings but some architectures like s390x like to re-read the
instruction which causes weirdness to occur. Rather than changing the
frontends make the plugin API a little more ergonomic and able to
handle the re-read case.

Stuff will still get strange if we read ahead of the opcode but so far
no front ends have done that and this patch asserts the case so we can
catch it early if they do.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211026102234.3961636-21-alex.bennee@linaro.org>
2021-11-04 10:32:01 +00:00
Eugenio Pérez a89b34be5e util: Make some iova_tree parameters const
As qemu guidelines:
Unless a pointer is used to modify the pointed-to storage, give it the
"const" attribute.

In the particular case of iova_tree_find it allows to enforce what is
requested by its comment, since the compiler would shout in case of
modifying or freeing the const-qualified returned pointer.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211013182713.888753-2-eperezma@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-02 15:57:21 +01:00
Luis Pires 40f3e79a86 host-utils: add 128-bit quotient support to divu128/divs128
These will be used to implement new decimal floating point
instructions from Power ISA 3.1.

The remainder is now returned directly by divu128/divs128,
freeing up phigh to receive the high 64 bits of the quotient.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-4-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-27 17:10:00 -07:00
Luis Pires 8ac2d6c526 host-utils: move udiv_qrnnd() to host-utils
Move udiv_qrnnd() from include/fpu/softfloat-macros.h to host-utils,
so it can be reused by divu128().

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-3-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-27 17:10:00 -07:00
Luis Pires 9276a31c34 host-utils: move checks out of divu128/divs128
In preparation for changing the divu128/divs128 implementations
to allow for quotients larger than 64 bits, move the div-by-zero
and overflow checks to the callers.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20211025191154.350831-2-luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-27 17:10:00 -07:00
Frédéric Pétrot 1c46937358 qemu/int128: Add int128_{not,xor}
Addition of not and xor on 128-bit integers.

Signed-off-by: Frédéric Pétrot <frederic.petrot@univ-grenoble-alpes.fr>
Co-authored-by: Fabien Portas <fabien.portas@grenoble-inp.org>
Message-Id: <20211025122818.168890-3-frederic.petrot@univ-grenoble-alpes.fr>
[rth: Split out logical operations.]
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-27 17:10:00 -07:00
Richard Henderson 14f12119aa mirror: Handle errors after READY cancel
v2: add small fix by Stefano, Hanna's series fixed
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEi5wmzbL9FHyIDoahVh8kwfGfefsFAmFfEVMACgkQVh8kwfGf
 eftAVA//WLtOaiVYPSjEl5EK80kry39VknZkQyeYUzyV7JNr/FRMlbJaIF2sOjH5
 KPRpfwBiuijOc8R0s34HY0BpyweRd1rbypHblZkO7EO4XwHx1FLF5kNHF6yV7wPL
 c9W564sZpc6Z96wSMgC4Is9QHJ6JbO4TJNJsG8v/PHEqGQV/yCYkgBox4loJckww
 uSAZ7l63IWA8uPSq/rOu34bREKN9s0kHkvFq0JNWk2HtOBLDiRDUYmbSfdjfT4jz
 np7ojKiffcAJED9JA28Zo2Y+MSId+FyoO4lbt+deMNzIHboy2oVlHouoHHprr61x
 dIO7Qt1IoMk5IBIfkPRYkReMwxxSVKuIJcWm8Qqtkcg2X0g5ayNUmHwpBMd50h2z
 XPjrr0YdOixhxMHoBnqlkPlWU0Y/B+YJIQ+mjqp+vRNkk94NoXhsXnCod1ajkgWO
 zjc/dztew7HvNStJaMM0rnEjanLhzFZKtlMO4WwZHQp2yZG2AINkPStswo2f3AmL
 FI+2By/UhFKm3BEemf0wYWDPWrPHU+BOiu16KjSKeS0GA9t7GXBUDRxNYPhUheXJ
 eJKIpNsGbseNxKrAbLyRhAB75Fa/ReZqqybmEcLyal/ball3R/cNF3gaMHeX0o1n
 HTGIAF5JOAXNGApS5TilkXPZ7jHFOVPh/Fi6/16/08tcgxjVfro=
 =TVTu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vsementsov/tags/pull-jobs-2021-10-07-v2' into staging

mirror: Handle errors after READY cancel
v2: add small fix by Stefano, Hanna's series fixed

# gpg: Signature made Thu 07 Oct 2021 08:25:07 AM PDT
# gpg:                using RSA key 8B9C26CDB2FD147C880E86A1561F24C1F19F79FB
# gpg: Good signature from "Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>" [unknown]
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8B9C 26CD B2FD 147C 880E  86A1 561F 24C1 F19F 79FB

* remotes/vsementsov/tags/pull-jobs-2021-10-07-v2:
  iotests: Add mirror-ready-cancel-error test
  mirror: Do not clear .cancelled
  mirror: Stop active mirroring after force-cancel
  mirror: Check job_is_cancelled() earlier
  mirror: Use job_is_cancelled()
  job: Add job_cancel_requested()
  job: Do not soft-cancel after a job is done
  jobs: Give Job.force_cancel more meaning
  job: @force parameter for job_cancel_sync()
  job: Force-cancel jobs in a failed transaction
  mirror: Drop s->synced
  mirror: Keep s->synced on error
  job: Context changes in job_completed_txn_abort()
  block/aio_task: assert `max_busy_tasks` is greater than 0
  block/backup: avoid integer overflow of `max-workers`

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-07 10:26:35 -07:00
Hanna Reitz 08b83bff2a job: Add job_cancel_requested()
Most callers of job_is_cancelled() actually want to know whether the job
is on its way to immediate termination.  For example, we refuse to pause
jobs that are cancelled; but this only makes sense for jobs that are
really actually cancelled.

A mirror job that is cancelled during READY with force=false should
absolutely be allowed to pause.  This "cancellation" (which is actually
a kind of completion) may take an indefinite amount of time, and so
should behave like any job during normal operation.  For example, with
on-target-error=stop, the job should stop on write errors.  (In
contrast, force-cancelled jobs should not get write errors, as they
should just terminate and not do further I/O.)

Therefore, redefine job_is_cancelled() to only return true for jobs that
are force-cancelled (which as of HEAD^ means any job that interprets the
cancellation request as a request for immediate termination), and add
job_cancel_requested() as the general variant, which returns true for
any jobs which have been requested to be cancelled, whether it be
immediately or after an arbitrarily long completion phase.

Finally, here is a justification for how different job_is_cancelled()
invocations are treated by this patch:

- block/mirror.c (mirror_run()):
  - The first invocation is a while loop that should loop until the job
    has been cancelled or scheduled for completion.  What kind of cancel
    does not matter, only the fact that the job is supposed to end.

  - The second invocation wants to know whether the job has been
    soft-cancelled.  Calling job_cancel_requested() is a bit too broad,
    but if the job were force-cancelled, we should leave the main loop
    as soon as possible anyway, so this should not matter here.

  - The last two invocations already check force_cancel, so they should
    continue to use job_is_cancelled().

- block/backup.c, block/commit.c, block/stream.c, anything in tests/:
  These jobs know only force-cancel, so there is no difference between
  job_is_cancelled() and job_cancel_requested().  We can continue using
  job_is_cancelled().

- job.c:
  - job_pause_point(), job_yield(), job_sleep_ns(): Only force-cancelled
    jobs should be prevented from being paused.  Continue using job_is_cancelled().

  - job_update_rc(), job_finalize_single(), job_finish_sync(): These
    functions are all called after the job has left its main loop.  The
    mirror job (the only job that can be soft-cancelled) will clear
    .cancelled before leaving the main loop if it has been
    soft-cancelled.  Therefore, these functions will observe .cancelled
    to be true only if the job has been force-cancelled.  We can
    continue to use job_is_cancelled().
    (Furthermore, conceptually, a soft-cancelled mirror job should not
    report to have been cancelled.  It should report completion (see
    also the block-job-cancel QAPI documentation).  Therefore, it makes
    sense for these functions not to distinguish between a
    soft-cancelled mirror job and a job that has completed as normal.)

  - job_completed_txn_abort(): All jobs other than @job have been
    force-cancelled.  job_is_cancelled() must be true for them.
    Regarding @job itself: job_completed_txn_abort() is mostly called
    when the job's return value is not 0.  A soft-cancelled mirror has a
    return value of 0, and so will not end up here then.
    However, job_cancel() invokes job_completed_txn_abort() if the job
    has been deferred to the main loop, which is mostly the case for
    completed jobs (which skip the assertion), but not for sure.
    To be safe, use job_cancel_requested() in this assertion.

  - job_complete(): This is function eventually invoked by the user
    (through qmp_block_job_complete() or qmp_job_complete(), or
    job_complete_sync(), which comes from qemu-img).  The intention here
    is to prevent a user from invoking job-complete after the job has
    been cancelled.  This should also apply to soft cancelling: After a
    mirror job has been soft-cancelled, the user should not be able to
    decide otherwise and have it complete as normal (i.e. pivoting to
    the target).

  - job_cancel(): Both functions are equivalent (see comment there), but
    we want to use job_is_cancelled(), because this shows that we call
    job_completed_txn_abort() only for force-cancelled jobs.  (As
    explained for job_update_rc(), soft-cancelled jobs should be treated
    as if they have completed as normal.)

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006151940.214590-9-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-10-07 10:42:40 +02:00
Hanna Reitz 73895f3838 jobs: Give Job.force_cancel more meaning
We largely have two cancel modes for jobs:

First, there is actual cancelling.  The job is terminated as soon as
possible, without trying to reach a consistent result.

Second, we have mirror in the READY state.  Technically, the job is not
really cancelled, but it just is a different completion mode.  The job
can still run for an indefinite amount of time while it tries to reach a
consistent result.

We want to be able to clearly distinguish which cancel mode a job is in
(when it has been cancelled).  We can use Job.force_cancel for this, but
right now it only reflects cancel requests from the user with
force=true, but clearly, jobs that do not even distinguish between
force=false and force=true are effectively always force-cancelled.

So this patch has Job.force_cancel signify whether the job will
terminate as soon as possible (force_cancel=true) or whether it will
effectively remain running despite being "cancelled"
(force_cancel=false).

To this end, we let jobs that provide JobDriver.cancel() tell the
generic job code whether they will terminate as soon as possible or not,
and for jobs that do not provide that method we assume they will.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20211006151940.214590-7-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-10-07 10:42:34 +02:00
Hanna Reitz 4cfb3f0562 job: @force parameter for job_cancel_sync()
Callers should be able to specify whether they want job_cancel_sync() to
force-cancel the job or not.

In fact, almost all invocations do not care about consistency of the
result and just want the job to terminate as soon as possible, so they
should pass force=true.  The replication block driver is the exception,
specifically the active commit job it runs.

As for job_cancel_sync_all(), all callers want it to force-cancel all
jobs, because that is the point of it: To cancel all remaining jobs as
quickly as possible (generally on process termination).  So make it
invoke job_cancel_sync() with force=true.

This changes some iotest outputs, because quitting qemu while a mirror
job is active will now lead to it being cancelled instead of completed,
which is what we want.  (Cancelling a READY mirror job with force=false
may take an indefinite amount of time, which we do not want when
quitting.  If users want consistent results, they must have all jobs be
done before they quit qemu.)

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/462
Signed-off-by: Hanna Reitz <hreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20211006151940.214590-6-hreitz@redhat.com>
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
2021-10-07 10:42:09 +02:00
Richard Henderson 37aff08726 plugins: Reorg arguments to qemu_plugin_vcpu_mem_cb
Use the MemOpIdx directly, rather than the rearrangement
of the same bits currently done by the trace infrastructure.
Pass in enum qemu_plugin_mem_rw so that we are able to treat
read-modify-write operations as a single operation.

Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-10-05 16:53:17 -07:00
Luis Pires d03bba0bfb host-utils: introduce uabs64()
Introduce uabs64(), a function that returns the absolute value of
a 64-bit int as an unsigned value. This avoids the undefined behavior
for common abs implementations, where abs of the most negative value is
undefined.

Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210910112624.72748-4-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-29 19:37:38 +10:00
Luis Pires 4ff2a971f4 host-utils: fix missing zero-extension in divs128
*plow (lower 64 bits of the dividend) is passed into divs128() as
a signed 64-bit integer. When building an __int128_t from it, it
must be zero-extended, instead of sign-extended.

Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Luis Pires <luis.pires@eldorado.org.br>
Message-Id: <20210910112624.72748-3-luis.pires@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-29 19:37:38 +10:00
Philippe Mathieu-Daudé 521b97cd4e util/vfio-helpers: Pass Error handle to qemu_vfio_dma_map()
Currently qemu_vfio_dma_map() displays errors on stderr.
When using management interface, this information is simply
lost. Pass qemu_vfio_dma_map() an Error** handle so it can
propagate the error to callers.

Reviewed-by: Fam Zheng <fam@euphon.net>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210902070025.197072-7-philmd@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2021-09-07 09:08:24 +01:00
Mahmoud Mandour 6a9e8a086d plugins/api: added a boolean parsing plugin api
This call will help boolean argument parsing since arguments are now
passed to plugins as a name and value.

Signed-off-by: Mahmoud Mandour <ma.mandourr@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20210730135817.17816-3-ma.mandourr@gmail.com>
[AJB: add to symbols]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2021-09-02 11:29:34 +01:00
Matheus Ferst 2484cd9c77 include/qemu/int128.h: introduce bswap128s
Changes the current bswap128 implementation to use __builtin_bswap128
when available, adds a bswap128 implementation for !CONFIG_INT128
builds, and introduces bswap128s based on bswap128.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20210826145656.2507213-2-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-27 12:43:11 +10:00
Matheus Ferst 181b0c333d include/qemu/int128.h: define struct Int128 according to the host endianness
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br>
Message-Id: <20210826141446.2488609-2-matheus.ferst@eldorado.org.br>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-27 12:41:47 +10:00
Mark Cave-Ayland 2f0e10a486 bitops.h: revert db1ffc32dd ("qemu/bitops.h: add bitrev8 implementation")
Commit db1ffc32dd ("qemu/bitops.h: add bitrev8 implementation") introduced
a bitrev8() function to reverse the bit ordering required for storing the
MAC address in the q800 PROM.

This function is not required since QEMU implements its own revbit8()
function which does exactly the same thing. Remove the extraneous
bitrev8() function and switch its only caller in hw/m68k/q800.c to
use revbit8() instead.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210725110557.3007-1-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-26 06:56:41 -10:00
Alex Bennée f7e68c9c99 tcg/plugins: implement a qemu_plugin_user_exit helper
In user-mode emulation there is a small race between preexit_cleanup
and exit_group() which means we may end up calling instrumented
instructions before the kernel reaps child threads. To solve this we
implement a new helper which ensures the callbacks are flushed along
with any translations before we let the host do it's a thing.

While we are at it make the documentation of
qemu_plugin_register_atexit_cb clearer as to what the user can expect.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Mahmoud Mandour <ma.mandourr@gmail.com>
Acked-by: Warner Losh <imp@bsdimp.com>
Message-Id: <20210720232703.10650-21-alex.bennee@linaro.org>
2021-07-23 17:22:16 +01:00
Richard Henderson 9ef0c6d6a7 qemu/atomic: Add aligned_{int64,uint64}_t types
Use it to avoid some clang-12 -Watomic-alignment errors,
forcing some structures to be aligned and as a pointer when
we have ensured that the address is aligned.

Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-21 07:45:38 -10:00
Richard Henderson 47345e7124 qemu/atomic: Remove pre-C11 atomic fallbacks
We now require c11, so the fallbacks are now dead code

Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-21 07:45:38 -10:00
Richard Henderson 952fd6710e qemu/atomic: Use macros for CONFIG_ATOMIC64
Clang warnings about questionable atomic usage get localized
to the inline function in atomic.h.  By using a macro, we get
the full traceback to the original use that caused the warning.

Tested-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2021-07-21 07:45:38 -10:00
Alex Bennée 2d93203998 plugins: fix-up handling of internal hostaddr for 32 bit
The compiler rightly complains when we build on 32 bit that casting
uint64_t into a void is a bad idea. We are really dealing with a host
pointer at this point so treat it as such. This does involve
a uintptr_t cast of the result of the TLB addend as we know that has
to point to the host memory.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210709143005.1554-28-alex.bennee@linaro.org>
2021-07-14 14:33:53 +01:00
Gerd Hoffmann d7795d3cc5 modules: check arch and block load on mismatch
Add module_allow_arch() to set the target architecture.
In case a module is limited to some arch verify arches
match and ignore the module if not.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-19-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-09 18:20:27 +02:00
Gerd Hoffmann 5ebbfecc3e modules: generate modinfo.c
Add script to generate C source with a small
database containing the module meta-data.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-4-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-09 18:20:27 +02:00
Gerd Hoffmann 22524c10c4 modules: add modinfo macros
Add macros for module info annotations.

Instead of having that module meta-data stored in lists in util/module.c
place directly in the module source code.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Jose R. Ziviani <jziviani@suse.de>
Message-Id: <20210624103836.2382472-2-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-09 18:20:22 +02:00
Paolo Bonzini 7db492a1b6 osdep: fix HAVE_BROKEN_SIZE_MAX case
While config-host.mak entries are expanded to "1" for compatibility with
create-config.sh, tests done directly in meson.build expand to the empty
string and cannot be placed to the right of the && operator.  Adjust
osdep.h after commit e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX
to meson", 2021-07-06) changed the way HAVE_BROKEN_SIZE_MAX is defined.

Reported-by: Frederic Bezies <fredbezies@gmail.com>
Fixes: e46bd55d9c ("configure: convert HAVE_BROKEN_SIZE_MAX to meson", 2021-07-06)
Resolves: #463
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-09 18:19:00 +02:00
Peter Maydell 53c0123118 Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEhpWov9P5fNqsNXdanKSrs4Grc8gFAmDm+YkACgkQnKSrs4Gr
 c8jkfQf+PuiSrU9t5MjU//jKPBpx/Jl9FqsKIlWC+6NOYQFhzD7A7rftuOSG1HOW
 Cil9rWO/57u/i7DcrABRlMkZ17zv6tP0876LRXRybVnjX4pQ4ayEtMio4WdjAna+
 1hL9P+c6tuo6wlGkM+R2yX4QIOSw+74Qbkh8sLrVwGYzA94erXbuJi/iix8t11iR
 joiNZ/k6yZPyMbzgT11A0On5qdaNNVfQeA59pyQEnucia02285VkCPLMaO+Trkgl
 7VQ1V3cNAdWVaPU0kXDCx595K1BlfrGHNimxfq2UeUZd2AvqHCohlG/egrEHfOu2
 Ek4dAqXn6p2eufGi/BzQi8V4J0nPRA==
 =SDmL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha-gitlab/tags/block-pull-request' into staging

Pull request

# gpg: Signature made Thu 08 Jul 2021 14:11:37 BST
# gpg:                using RSA key 8695A8BFD3F97CDAAC35775A9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>" [full]
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>" [full]
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha-gitlab/tags/block-pull-request:
  block/io: Merge discard request alignments
  block: Add backend_defaults property
  block/file-posix: Optimize for macOS
  util/async: print leaked BH name when AioContext finalizes
  util/async: add a human-readable name to BHs for debugging

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-07-08 22:17:28 +01:00
Paolo Bonzini 904806c69b qemu-option: remove now-dead code
-M was the sole user of qemu_opts_set and qemu_opts_set_defaults,
remove them and the arguments that they used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini c445909e1f keyval: introduce keyval_parse_into
Allow parsing multiple keyval sequences into the same dictionary.
This will be used to simplify the parsing of the -M command line
option, which is currently a .merge_lists = true QemuOpts group.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00
Paolo Bonzini 9176e800db keyval: introduce keyval_merge
This patch introduces a function that merges two keyval-produced
(or keyval-like) QDicts.  It can be used to emulate the behavior of
.merge_lists = true QemuOpts groups, merging -readconfig sections and
command-line options in a single QDict, and also to implement -set.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-06 08:33:51 +02:00