The polymorphic locking macros don't support QemuRecMutex yet. Add it
so that lock guards can be used with QemuRecMutex.
Convert TCG plugins functions that benefit from these macros. Manual
qemu_rec_mutex_lock/unlock() callers are left unmodified in cases where
clarity would not improve by switching to the macros.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch introduces two lock guard macros that automatically unlock a
lock object (QemuMutex and others):
void f(void) {
QEMU_LOCK_GUARD(&mutex);
if (!may_fail()) {
return; /* automatically unlocks mutex */
}
...
}
and:
WITH_QEMU_LOCK_GUARD(&mutex) {
if (!may_fail()) {
return; /* automatically unlocks mutex */
}
}
/* automatically unlocks mutex here */
...
Convert qemu-timer.c functions that benefit from these macros as an
example. Manual qemu_mutex_lock/unlock() callers are left unmodified in
cases where clarity would not improve by switching to the macros.
Many other QemuMutex users remain in the codebase that might benefit
from lock guards. Over time they can be converted, if that is
desirable.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[Use QEMU_MAKE_LOCKABLE_NONNULL. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will be needed for lock guards, because if the lock is NULL the
dummy for loop of the lock guard never runs. This can cause confusion
and dummy warnings in the compiler, but even if it did not, aborting
with a NULL pointer dereference is a less surprising behavior.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Traditionally, the uint-specific property helpers only offer getters.
When adding object (or class) uint types, one must therefore use the
generic property helper if a setter is needed (and probably duplicate
some code writing their own getters/setters).
This enhances the uint-specific property helper APIs by adding a
bitwise-or'd 'flags' field and modifying all clients of that API to set
this paramater to OBJ_PROP_FLAG_READ. This maintains the current
behaviour whilst allowing others to also set OBJ_PROP_FLAG_WRITE (or use
the more convenient OBJ_PROP_FLAG_READWRITE) in the future (which will
automatically install a setter). Other flags may be added later.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, TSC is set as part of the VM runtime state. Setting TSC at
runtime is heavy and additionally can have side effects on the guest,
which are not very resilient to variances in the TSC. This patch uses
the VM state to determine whether to set TSC or not. Some minor
enhancements for getting TSC values as well that considers the VM state.
Additionally, while setting the TSC, the partition is suspended to
reduce the variance in the TSC value across vCPUs.
Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Message-Id: <SN4PR2101MB08804D23439166E81FF151F7C0EA0@SN4PR2101MB0880.namprd21.prod.outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
And intialize buffer_is_zero() with it, when Intel AVX512F is
available on host.
This function utilizes Intel AVX512 fundamental instructions which
is faster than its implementation with AVX2 (in my unit test, with
4K buffer, on CascadeLake SP, ~36% faster, buffer_zero_avx512() V.S.
buffer_zero_avx2()).
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Description copied from Linux kernel commit from Gustavo A. R. Silva
(see [3]):
--v-- description start --v--
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to
declare variable-length types such as these ones is a flexible
array member [1], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler
warning in case the flexible array does not occur last in the
structure, which will help us prevent some kind of undefined
behavior bugs from being unadvertenly introduced [2] to the
Linux codebase from now on.
--^-- description end --^--
Do the similar housekeeping in the QEMU codebase (which uses
C99 since commit 7be41675f7).
All these instances of code were found with the help of the
following command (then manual analysis, without modifying
structures only having a single flexible array member, such
QEDTable in block/qed.h):
git grep -F '[0];'
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76497732932f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?id=17642a2fbd2c1
Inspired-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Description copied from Linux kernel commit from Gustavo A. R. Silva
(see [3]):
--v-- description start --v--
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to
declare variable-length types such as these ones is a flexible
array member [1], introduced in C99:
struct foo {
int stuff;
struct boo array[];
};
By making use of the mechanism above, we will get a compiler
warning in case the flexible array does not occur last in the
structure, which will help us prevent some kind of undefined
behavior bugs from being unadvertenly introduced [2] to the
Linux codebase from now on.
--^-- description end --^--
Do the similar housekeeping in the QEMU codebase (which uses
C99 since commit 7be41675f7).
All these instances of code were found with the help of the
following Coccinelle script:
@@
identifier s, m, a;
type t, T;
@@
struct s {
...
t m;
- T a[0];
+ T a[];
};
@@
identifier s, m, a;
type t, T;
@@
struct s {
...
t m;
- T a[0];
+ T a[];
} QEMU_PACKED;
[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=76497732932f
[3] https://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux.git/commit/?id=17642a2fbd2c1
Inspired-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fix various bugs that might result in an assert() due to
incorrect hflags for M-profile CPUs
* Fix Aspeed SMC Controller user-mode select handling
* Report correct (with-tag) address in fault address register
when TBI is enabled
* cubieboard: make sure SOC object isn't leaked
* fsl-imx25: Wire up eSDHC controllers
* fsl-imx25: Wire up USB controllers
* New board model: orangepi-pc (OrangePi PC)
* ARM/KVM: if user doesn't select GIC version and the
host kernel can only provide GICv3, use that, rather
than defaulting to "fail because GICv2 isn't possible"
* kvm: Only do KVM_SET_VCPU_EVENTS at the last stage of sync
-----BEGIN PGP SIGNATURE-----
iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAl5qZsIZHHBldGVyLm1h
eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3krfD/40xprKOtpel6si3edDQsw5
j6LamqJDvaUdtG713OjR6yjvvQiXCw9yCDlfGhBlLhLW1t0aGKrrZoRC4CNMOt0J
34WAcDP0iz3ALEwpfNfr/DWFwiGjamabrRsGcq08Q42G7+UA7FhUEvL25ZApfRFy
8O2gs3bDv30pfa5oJwYJhvSHcNeR9YKueK+WGw16gRkSoNbjUxnSpyiGnxbMaNpg
aL+ZQzQ0BOAyeOg/0LUdZ9meAvWwR0NppgK0ujJxq68/6tPz8tv2+pQgllNYSQRO
vDr4mj6MlJNW62M5IAKRZ/6zTz34+7UYQ7mTK2VTWt2qtfrANz+EpcDljtc/8EIF
lAVd1W099DNdqgFcUGZzoWyRbmjz9B76WTJ43orY5AbMZN5l4XwAGItkE6yQbqKd
kqPKP2ICFj/0JhgBoTzo0J/5wV2izZKKnih990IJU390oWoiVRbdWlQDJ2ujQ3AV
havWhR/tL399K1UZl8act/J9rifq9J3mbiqpx2XEEiFMu93FDNCPtJioix1Swvpx
ERMB9VA6JNAHZ6oAGgNmTHG3nSJtpcin8XxR5YcKWSYiksPjkce1sEwtRbyxBHtq
jb/yk5mjyrbYTy3Gmg85/Fh74XnELsnwmADFdezHXUu4EPxth/ssCuXlXs8DIciI
sWGFVJpDoWSXEqi4FjvhIQ==
=LLqm
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20200312' into staging
target-arm queue:
* Fix various bugs that might result in an assert() due to
incorrect hflags for M-profile CPUs
* Fix Aspeed SMC Controller user-mode select handling
* Report correct (with-tag) address in fault address register
when TBI is enabled
* cubieboard: make sure SOC object isn't leaked
* fsl-imx25: Wire up eSDHC controllers
* fsl-imx25: Wire up USB controllers
* New board model: orangepi-pc (OrangePi PC)
* ARM/KVM: if user doesn't select GIC version and the
host kernel can only provide GICv3, use that, rather
than defaulting to "fail because GICv2 isn't possible"
* kvm: Only do KVM_SET_VCPU_EVENTS at the last stage of sync
# gpg: Signature made Thu 12 Mar 2020 16:43:46 GMT
# gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg: issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate]
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20200312: (36 commits)
target/arm: kvm: Inject events at the last stage of sync
hw/arm/virt: kvm: allow gicv3 by default if v2 cannot work
hw/arm/virt: kvm: Restructure finalize_gic_version()
target/arm/kvm: Let kvm_arm_vgic_probe() return a bitmap
hw/arm/virt: Introduce finalize_gic_version()
hw/arm/virt: Introduce VirtGICType enum type
hw/arm/virt: Document 'max' value in gic-version property description
docs: add Orange Pi PC document
tests/boot_linux_console: Test booting NetBSD via U-Boot on OrangePi PC
tests/boot_linux_console: Add a SLOW test booting Ubuntu on OrangePi PC
tests/boot_linux_console: Add a SD card test for the OrangePi PC board
tests/boot_linux_console: Add initrd test for the Orange Pi PC board
tests/boot_linux_console: Add a quick test for the OrangePi PC board
hw/arm/allwinner: add RTC device support
hw/arm/allwinner-h3: add SDRAM controller device
hw/arm/allwinner-h3: add Boot ROM support
hw/arm/allwinner-h3: add EMAC ethernet device
hw/arm/allwinner: add SD/MMC host controller
hw/arm/allwinner: add Security Identifier device
hw/arm/allwinner: add CPU Configuration module
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let's move the code which freezes which gic-version to
be applied in a dedicated function. We also now set by
default the VIRT_GIC_VERSION_NO_SET. This eventually
turns into the legacy v2 choice in the finalize() function.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20200311131618.7187-4-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We plan to introduce yet another value for the gic version (nosel).
As we already use exotic values such as 0 and -1, let's introduce
a dedicated enum type and let vms->gic_version take this
type.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 20200311131618.7187-3-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allwinner System-on-Chips usually contain a Real Time Clock (RTC)
for non-volatile system date and time keeping. This commit adds a generic
Allwinner RTC device that supports the RTC devices found in Allwinner SoC
family sun4i (A10), sun7i (A20) and sun6i and newer (A31, H2+, H3, etc).
The following RTC functionality and features are implemented:
* Year-Month-Day read/write
* Hour-Minute-Second read/write
* General Purpose storage
The following boards are extended with the RTC device:
* Cubieboard (hw/arm/cubieboard.c)
* Orange Pi PC (hw/arm/orangepi.c)
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200311221854.30370-13-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the Allwinner H3 SoC the SDRAM controller is responsible
for interfacing with the external Synchronous Dynamic Random
Access Memory (SDRAM). Types of memory that the SDRAM controller
supports are DDR2/DDR3 and capacities of up to 2GiB. This commit
adds emulation support of the Allwinner H3 SDRAM controller.
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200311221854.30370-12-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A real Allwinner H3 SoC contains a Boot ROM which is the
first code that runs right after the SoC is powered on.
The Boot ROM is responsible for loading user code (e.g. a bootloader)
from any of the supported external devices and writing the downloaded
code to internal SRAM. After loading the SoC begins executing the code
written to SRAM.
This commits adds emulation of the Boot ROM firmware setup functionality
by loading user code from SD card in the A1 SRAM. While the A1 SRAM is
64KiB, we limit the size to 32KiB because the real H3 Boot ROM also rejects
sizes larger than 32KiB. For reference, this behaviour is documented
by the Linux Sunxi project wiki at:
https://linux-sunxi.org/BROM#U-Boot_SPL_limitations
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200311221854.30370-11-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Allwinner Sun8i System on Chip family includes an Ethernet MAC (EMAC)
which provides 10M/100M/1000M Ethernet connectivity. This commit
adds support for the Allwinner EMAC from the Sun8i family (H2+, H3, A33, etc),
including emulation for the following functionality:
* DMA transfers
* MII interface
* Transmit CRC calculation
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200311221854.30370-10-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Allwinner System on Chip families sun4i and above contain
an integrated storage controller for Secure Digital (SD) and
Multi Media Card (MMC) interfaces. This commit adds support
for the Allwinner SD/MMC storage controller with the following
emulated features:
* DMA transfers
* Direct FIFO I/O
* Short/Long format command responses
* Auto-Stop command (CMD12)
* Insert & remove card detection
The following boards are extended with the SD host controller:
* Cubieboard (hw/arm/cubieboard.c)
* Orange Pi PC (hw/arm/orangepi.c)
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200311221854.30370-9-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Security Identifier device found in various Allwinner System on Chip
designs gives applications a per-board unique identifier. This commit
adds support for the Allwinner Security Identifier using a 128-bit
UUID value as input.
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200311221854.30370-8-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Various Allwinner System on Chip designs contain multiple processors
that can be configured and reset using the generic CPU Configuration
module interface. This commit adds support for the Allwinner CPU
configuration interface which emulates the following features:
* CPU reset
* CPU status
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20200311221854.30370-7-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Allwinner H3 System on Chip has an System Control
module that provides system wide generic controls and
device information. This commit adds support for the
Allwinner H3 System Control module.
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200311221854.30370-6-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Allwinner H3 System on Chip contains multiple USB 2.0 bus
connections which provide software access using the Enhanced
Host Controller Interface (EHCI) and Open Host Controller
Interface (OHCI) interfaces. This commit adds support for
both interfaces in the Allwinner H3 System on Chip.
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200311221854.30370-5-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Clock Control Unit is responsible for clock signal generation,
configuration and distribution in the Allwinner H3 System on Chip.
This commit adds support for the Clock Control Unit which emulates
a simple read/write register interface.
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200311221854.30370-4-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Allwinner H3 is a System on Chip containing four ARM Cortex A7
processor cores. Features and specifications include DDR2/DDR3 memory,
SD/MMC storage cards, 10/100/1000Mbit Ethernet, USB 2.0, HDMI and
various I/O modules. This commit adds support for the Allwinner H3
System on Chip.
Signed-off-by: Niek Linnenbank <nieklinnenbank@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20200311221854.30370-2-nieklinnenbank@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
i.MX25 supports two USB controllers. Let's wire them up.
With this patch, imx25-pdk can boot from both USB ports.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200310215146.19688-3-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wire up eSDHC controllers in fsl-imx25. For imx25-pdk, connect drives
provided on the command line to available eSDHC controllers.
This patch enables booting the imx25-pdk emulation from SD card.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 20200310215146.19688-2-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: made commit subject consistent with other patch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Using the new 'bdrv_co_delete_file' interface, a pure co_routine function
'bdrv_co_delete_file' inside block.c can can be used in a way similar of
the existing bdrv_create_file to to clean up a created file.
We're creating a pure co_routine because the only caller of
'bdrv_co_delete_file' will be already in co_routine context, thus there
is no need to add all the machinery to check for qemu_in_coroutine() and
create a separated co_routine to do the job.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200130213907.2830642-3-danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Adding to Block Drivers the capability of being able to clean up
its created files can be useful in certain situations. For the
LUKS driver, for instance, a failure in one of its authentication
steps can leave files in the host that weren't there before.
This patch adds the 'bdrv_co_delete_file' interface to block
drivers and add it to the 'file' driver in file-posix.c. The
implementation is given by 'raw_co_delete_file'.
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Message-Id: <20200130213907.2830642-2-danielhb413@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <20200310113831.27293-2-kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hide structure definitions and add explicit API instead, to keep an
eye on the scope of the shared fields.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200311103004.7649-10-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
offset/bytes pair is more usual naming in block layer, let's use it.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200311103004.7649-8-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
We have a lot of "chunk_end - start" invocations, let's switch to
bytes/cur_bytes scheme instead.
While being here, improve check on block_copy_do_copy parameters to not
overflow when calculating nbytes and use int64_t for bytes in
block_copy for consistency.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200311103004.7649-7-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Assume we have two regions, A and B, and region B is in-flight now,
region A is not yet touched, but it is unallocated and should be
skipped.
Correspondingly, as progress we have
total = A + B
current = 0
If we reset unallocated region A and call progress_reset_callback,
it will calculate 0 bytes dirty in the bitmap and call
job_progress_set_remaining, which will set
total = current + 0 = 0 + 0 = 0
So, B bytes are actually removed from total accounting. When job
finishes we'll have
total = 0
current = B
, which doesn't sound good.
This is because we didn't considered in-flight bytes, actually when
calculating remaining, we should have set (in_flight + dirty_bytes)
as remaining, not only dirty_bytes.
To fix it, let's refactor progress calculation, moving it to block-copy
itself instead of fixing callback. And, of course, track in_flight
bytes count.
We still have to keep one callback, to maintain backup job bytes_read
calculation, but it will go on soon, when we turn the whole backup
process into one block_copy call.
Cc: qemu-stable@nongnu.org
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <20200311103004.7649-3-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
We need it in separate to pass to the block-copy object in the next
commit.
Cc: qemu-stable@nongnu.org
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200311103004.7649-2-vsementsov@virtuozzo.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The qcow2 .bdrv_measure() code calculates the crypto payload offset.
This logic really belongs in crypto/block.c where it can be reused by
other image formats.
The "luks" block driver will need this same logic in order to implement
.bdrv_measure(), so extract the qcrypto_block_calculate_payload_offset()
function now.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20200221112522.1497712-2-stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-12-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-11-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-10-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-9-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
hmp_snapshot_blkdev is from GPLv2 version of the hmp-cmds.c thus
have to change the licence to GPLv2
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-8-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-7-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Moved code was added after 2012-01-13, thus under GPLv2+
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-6-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixed commit message
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20200308092440.23564-5-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
These days device-hotplug.c only contains the hmp_drive_add
In the next patch, rest of hmp_drive* functions will be moved
there.
Also add block-hmp-cmds.h to contain prototypes of these
functions
License for block-hmp-cmds.h since it contains the code
moved from sysemu.h which lacks license and thus according
to LICENSE is under GPLv2+
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200308092440.23564-4-mlevitsk@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
When there are many poll handlers it's likely that some of them are idle
most of the time. Remove handlers that haven't had activity recently so
that the polling loop scales better for guests with a large number of
devices.
This feature only takes effect for the Linux io_uring fd monitoring
implementation because it is capable of combining fd monitoring with
userspace polling. The other implementations can't do that and risk
starving fds in favor of poll handlers, so don't try this optimization
when they are in use.
IOPS improves from 10k to 105k when the guest has 100
virtio-blk-pci,num-queues=32 devices and 1 virtio-blk-pci,num-queues=1
device for rw=randread,iodepth=1,bs=4k,ioengine=libaio on NVMe.
[Clarified aio_poll_handlers locking discipline explanation in comment
after discussion with Paolo Bonzini <pbonzini@redhat.com>.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-8-stefanha@redhat.com
Message-Id: <20200305170806.1313245-8-stefanha@redhat.com>
Unlike ppoll(2) and epoll(7), Linux io_uring completions can be polled
from userspace. Previously userspace polling was only allowed when all
AioHandler's had an ->io_poll() callback. This prevented starvation of
fds by userspace pollable handlers.
Add the FDMonOps->need_wait() callback that enables userspace polling
even when some AioHandlers lack ->io_poll().
For example, it's now possible to do userspace polling when a TCP/IP
socket is monitored thanks to Linux io_uring.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-7-stefanha@redhat.com
Message-Id: <20200305170806.1313245-7-stefanha@redhat.com>
The recent Linux io_uring API has several advantages over ppoll(2) and
epoll(2). Details are given in the source code.
Add an io_uring implementation and make it the default on Linux.
Performance is the same as with epoll(7) but later patches add
optimizations that take advantage of io_uring.
It is necessary to change how aio_set_fd_handler() deals with deleting
AioHandlers since removing monitored file descriptors is asynchronous in
io_uring. fdmon_io_uring_remove() marks the AioHandler deleted and
aio_set_fd_handler() will let it handle deletion in that case.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-6-stefanha@redhat.com
Message-Id: <20200305170806.1313245-6-stefanha@redhat.com>
The AioHandler *node, bool is_new arguments are more complicated to
think about than simply being given AioHandler *old_node, AioHandler
*new_node.
Furthermore, the new Linux io_uring file descriptor monitoring mechanism
added by the new patch requires access to both the old and the new
nodes. Make this change now in preparation.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-5-stefanha@redhat.com
Message-Id: <20200305170806.1313245-5-stefanha@redhat.com>
The ppoll(2) and epoll(7) file descriptor monitoring implementations are
mixed with the core util/aio-posix.c code. Before adding another
implementation for Linux io_uring, extract out the existing
ones so there is a clear interface and the core code is simpler.
The new interface is AioContext->fdmon_ops, a pointer to a FDMonOps
struct. See the patch for details.
Semantic changes:
1. ppoll(2) now reflects events from pollfds[] back into AioHandlers
while we're still on the clock for adaptive polling. This was
already happening for epoll(7), so if it's really an issue then we'll
need to fix both in the future.
2. epoll(7)'s fallback to ppoll(2) while external events are disabled
was broken when the number of fds exceeded the epoll(7) upgrade
threshold. I guess this code path simply wasn't tested and no one
noticed the bug. I didn't go out of my way to fix it but the correct
code is simpler than preserving the bug.
I also took some liberties in removing the unnecessary
AioContext->epoll_available (just check AioContext->epollfd != -1
instead) and AioContext->epoll_enabled (it's implicit if our
AioContext->fdmon_ops callbacks are being invoked) fields.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://lore.kernel.org/r/20200305170806.1313245-4-stefanha@redhat.com
Message-Id: <20200305170806.1313245-4-stefanha@redhat.com>