Commit Graph

770 Commits

Author SHA1 Message Date
Richard Henderson 187ba69453 accel/tcg: Move TLB_WATCHPOINT to TLB_SLOW_FLAGS_MASK
This frees up one bit of the primary tlb flags without
impacting the TLB_NOTDIRTY logic.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Richard Henderson 58e8f1f616 accel/tcg: Store some tlb flags in CPUTLBEntryFull
We have run out of bits we can use within the CPUTLBEntry comparators,
as TLB_FLAGS_MASK cannot overlap alignment.

Store slow_flags[] in CPUTLBEntryFull, and merge with the flags from
the comparator.  A new TLB_FORCE_SLOW bit is set within the comparator
as an indication that the slow path must be used.

Move TLB_BSWAP to TLB_SLOW_FLAGS_MASK.  Since we are out of bits,
we cannot create a new bit without moving an old one.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Richard Henderson 97e1576957 accel/tcg: Remove check_tcg_memory_orders_compatible
We now issue host memory barriers to match the guest memory order.
Continue to disable MTTCG only if the guest has not been ported.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Richard Henderson f86e8f3d13 tcg: Add host memory barriers to cpu_ldst.h interfaces
Bring the helpers into line with the rest of tcg in respecting
guest memory ordering.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Fei Wu 1b65b4f54c accel/tcg: remove CONFIG_PROFILER
TBStats will be introduced to replace CONFIG_PROFILER totally, here
remove all CONFIG_PROFILER related stuffs first.

Signed-off-by: Vanderson M. do Rosario <vandersonmr2@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fei Wu <fei2.wu@intel.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230607122411.3394702-2-fei2.wu@intel.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Anton Johansson b1c09220b4 accel/tcg: Replace target_ulong with vaddr in translator_*()
Use vaddr for guest virtual address in translator_use_goto_tb() and
translator_loop().

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-11-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Anton Johansson b0326eb999 accel/tcg: Replace target_ulong with vaddr in *_mmu_lookup()
Update atomic_mmu_lookup() and cpu_mmu_lookup() to take the guest
virtual address as a vaddr instead of a target_ulong.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-10-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:33:00 +02:00
Anton Johansson 4f8f41272e accel: Replace target_ulong with vaddr in probe_*()
Functions for probing memory accesses (and functions that call these)
are updated to take a vaddr for guest virtual addresses over
target_ulong.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-9-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson 06f3831c08 accel/tcg: Widen pc to vaddr in CPUJumpCache
Related functions dealing with the jump cache are also updated.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-8-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson f0a08b0913 accel/tcg/cpu-exec.c: Widen pc to vaddr
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-7-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson fb2c53cb71 accel/tcg/cputlb.c: Widen addr in MMULookupPageData
Functions accessing MMULookupPageData are also updated.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-6-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson 9e39de980f accel/tcg/cputlb.c: Widen CPUTLBEntry access functions
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-5-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson bb5de52524 target: Widen pc/cs_base in cpu_get_tb_cpu_state
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-4-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson 256d11f9ba accel/tcg/translate-all.c: Widen pc and cs_base
Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-3-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Anton Johansson 732d548732 accel: Replace target_ulong in tlb_*()
Replaces target_ulong with vaddr for guest virtual addresses in tlb_*()
functions and auxilliary structs.

Signed-off-by: Anton Johansson <anjo@rev.ng>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230621135633.1649-2-anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-26 17:32:59 +02:00
Philippe Mathieu-Daudé a3e7f70229 accel/tcg/cpu-exec: Use generic 'helper-proto-common.h' header
We only need lookup_tb_ptr() prototype.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230611085846.21415-3-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Philippe Mathieu-Daudé c7b64948f8 meson: Replace CONFIG_SOFTMMU -> CONFIG_SYSTEM_ONLY
Since we *might* have user emulation with softmmu,
use the clearer 'CONFIG_SYSTEM_ONLY' key to check
for system emulation.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-9-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Philippe Mathieu-Daudé 905db98a73 accel/tcg: Check for USER_ONLY definition instead of SOFTMMU one
Since we *might* have user emulation with softmmu,
replace the system emulation check by !user emulation one.

Invert some if() ladders for clarity.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-7-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Richard Henderson 2be6a48673 accel/tcg: Handle MO_ATOM_WITHIN16 in do_st16_leN
Otherwise we hit the default assert not reached.
Handle it as MO_ATOM_NONE, because of size and misalignment.
We already handle this correctly in do_ld16_beN.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Ivan Klokov b84694defb util/log: Add vector registers to log
Added QEMU option 'vpu' to log vector extension registers such as gpr\fpu.

Signed-off-by: Ivan Klokov <ivan.klokov@syntacore.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20230410124451.15929-2-ivan.klokov@syntacore.com>
Signed-off-by: Alistair Francis <alistair.francis@wdc.com>
2023-06-13 17:42:01 +10:00
Richard Henderson c0dde5fc5c accel/tcg: Fix undefined shift in store_whole_le16
The computation is documented as unused in this case,
but triggers an ubsan error:

../accel/tcg/ldst_atomicity.c.inc:837:33: runtime error: shift exponent -32 is negative
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../accel/tcg/ldst_atomicity.c.inc:837:33 in

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20230606171629.98157-1-richard.henderson@linaro.org>
2023-06-06 12:11:02 -07:00
Paolo Bonzini 06831001ac atomics: eliminate mb_read/mb_set
qatomic_mb_read and qatomic_mb_set were the very first atomic primitives
introduced for QEMU; their semantics are unclear and they provide a false
sense of safety.

The last use of qatomic_mb_read() has been removed, so delete it.
qatomic_mb_set() instead can survive as an optimized
qatomic_set()+smp_mb(), similar to Linux's smp_store_mb(), but
rename it to qatomic_set_mb() to match the order of the two
operations.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-06-06 09:42:14 +02:00
Ilya Leoshkevich e7cd7a3916 accel/tcg: Unmap perf_marker
Coverity complains that perf_marker is never unmapped.
Fix by unmapping it in perf_exit().

Fixes: Coverity CID 1507929
Fixes: 5584e2dbe8 ("tcg: add perfmap and jitdump")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230605114134.1169974-1-iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:06:49 -07:00
Richard Henderson bc54ef8c6a plugins: Move plugin_insn_append to translator.c
This function is only used in translator.c, and uses a
target-specific typedef: abi_ptr.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson cac9b0fd08 tcg: Remove target-specific headers from tcg.[ch]
This finally paves the way for tcg/ to be built once per mode.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 653c46daf2 accel/tcg: Tidy includes for translator.[ch]
Reduce the header to only bswap.h and cpu_ldst.h.
Move exec/translate-all.h to translator.c.
Reduce tcg.h and tcg-op.h to tcg-op-common.h.
Remove otherwise unused headers.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 309e014dd1 accel/tcg: Move translator_fake_ldb out of line
This is used by exactly one host in extraordinary circumstances.
This means that translator.h need not include plugin-gen.h;
translator.c already includes plugin-gen.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson dfd1b81274 accel/tcg: Introduce translator_io_start
New wrapper around gen_io_start which takes care of the USE_ICOUNT
check, as well as marking the DisasContext to end the TB.
Remove exec/gen-icount.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 5623423359 accel/tcg: Move most of gen-icount.h into translator.c
The only usage of gen_tb_start and gen_tb_end are here.
Move the static icount_start_insn variable into a local
within translator_loop.  Simplify the two subroutines
by passing in the existing local cflags variable.

Leave only the declaration of gen_io_start in gen-icount.h.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 85314e13ad exec-all: Widen TranslationBlock pc and cs_base to 64-bits
This makes TranslationBlock agnostic to the address size of the guest.
Use vaddr for pc, since that's always a virtual address.
Use uint64_t for cs_base, since usage varies between guests.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 0a18945d03 tcg: Remove NO_CPU_IO_DEFS
From this remove, it's no longer clear what this is attempting
to protect.  The last time a use of this define was added to
the source tree, as opposed to merely moved around, was 2008.
There have been many cleanups since that time and this is
no longer required for the build to succeed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 28ea568a03 tcg: Add guest_mo to TCGContext
This replaces of TCG_GUEST_DEFAULT_MO in tcg-op-ldst.c.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 747bd69d0f tcg: Add insn_start_words to TCGContext
This will enable replacement of TARGET_INSN_START_WORDS in tcg.c.
Split out "tcg/insn-start-words.h" and use it in target/.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson c213ee2dfc tcg: Split helper-proto.h
Create helper-proto-common.h without the target specific portion.
Use that in tcg-op-common.h.  Include helper-proto.h in target/arm
and target/hexagon before helper-info.c.inc; all other targets are
already correct in this regard.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson d53106c997 tcg: Pass TCGHelperInfo to tcg_gen_callN
In preparation for compiling tcg/ only once, eliminate
the all_helpers array.  Instantiate the info structs for
the generic helpers in accel/tcg/, and the structs for
the target-specific helpers in each translate.c.

Since we don't see all of the info structs at startup,
initialize at first use, using g_once_init_* to make
sure we don't race while doing so.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:29 -07:00
Richard Henderson 70f168f88c tcg: Split out tcg/oversized-guest.h
Move a use of TARGET_LONG_BITS out of tcg/tcg.h.
Include the new file only where required.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson e5b4906377 *: Add missing includes of tcg/tcg.h
This had been pulled in from exec/cpu_ldst.h, via exec/exec-all.h,
but the include of tcg.h will be removed.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson d0a9bb5ecb tcg: Add tlb_fast_offset to TCGContext
Disconnect the layout of ArchCPU from TCG compilation.
Pass the relative offset of 'env' and 'neg.tlb.f' as a parameter.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson 238f43809a tcg: Widen CPUTLBEntry comparators to 64-bits
This makes CPUTLBEntry agnostic to the address size of the guest.
When 32-bit addresses are in effect, we can simply read the low
32 bits of the 64-bit field.  Similarly when we need to update
the field for setting TLB_NOTDIRTY.

For TCG backends that could in theory be big-endian, but in
practice are not (arm, loongarch, riscv), use QEMU_BUILD_BUG_ON
to document and ensure this is not accidentally missed.

For s390x, which is always big-endian, use HOST_BIG_ENDIAN anyway,
to document the reason for the adjustment.

For sparc64 and ppc64, always perform a 64-bit load, and rely on
the following 32-bit comparison to ignore the high bits.

Rearrange mips and ppc if ladders for clarity.

Reviewed-by: Anton Johansson <anjo@rev.ng>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Richard Henderson ff0c61bf35 tcg: Move TCG_TYPE_TL from tcg.h to tcg-op.h
Removes the only use of TARGET_LONG_BITS from tcg.h, which is to be
target independent.  Move the symbol to a define in tcg-op.h, which
will continue to be target dependent.  Rather than complicate matters
for the use in tb_gen_code(), expand the definition there.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-05 12:04:28 -07:00
Alex Bennée 367189efae accel/tcg: include cs_base in our hash calculations
We weren't using cs_base in the hash calculations before. Since the
arm front end moved a chunk of flags in a378206a20 (target/arm: Move
mode specific TB flags to tb->cs_base) they comprise of an important
part of the execution state.

Widen the tb_hash_func to include cs_base and expand to qemu_xxhash8()
to accommodate it.

My initial benchmark shows very little difference in the
runtime.

Before:

armhf

➜  hyperfine -w 2 -m 20 "./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot"
Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.627 s ±  2.708 s    [User: 34.309 s, System: 1.797 s]
  Range (min … max):   22.345 s … 29.864 s    20 runs

arm64

➜  hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.559 s ±  2.917 s    [User: 189.115 s, System: 4.089 s]
  Range (min … max):   59.997 s … 70.153 s    10 runs

After:

armhf

Benchmark 1: ./arm-softmmu/qemu-system-arm -cpu cortex-a15 -machine type=virt,highmem=off -display none -m 2048 -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf -device scsi-hd,drive=hd -smp 4 -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' -snapshot
  Time (mean ± σ):     24.223 s ±  2.151 s    [User: 34.284 s, System: 1.906 s]
  Range (min … max):   22.000 s … 28.476 s    20 runs

arm64

hyperfine -w 2 -n 20 "./qemu-system-aarch64 -cpu max,pauth-impdef=on -machine type=virt,virtualization=on,gic-version=3 -display none -serial mon:stdio -netdev user,id=unet,hostfwd=tcp::2222-:22,hostfwd=tcp::1234-:1234 -device virtio-net-pci,netdev=unet -device virtio-scsi-pci -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-arm64 -device scsi-hd,drive=hd -smp 4 -kernel ~/lsrc/linux.git/builds/arm64/arch/arm64/boot/Image.gz -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark-pigz.service' -snapshot"
Benchmark 1: 20
  Time (mean ± σ):     62.769 s ±  1.978 s    [User: 188.431 s, System: 5.269 s]
  Range (min … max):   60.285 s … 66.868 s    10 runs

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20230526165401.574474-12-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-11-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01 11:05:05 -04:00
Alex Bennée d0aaf08bb9 tcg: remove the final vestiges of dstate
Now we no longer have dynamic state affecting things we can remove the
additional fields in cpu.h and simplify the TB hash calculation.

For the benchmark:

    hyperfine -w 2 -m 20 \
      "./arm-softmmu/qemu-system-arm -cpu cortex-a15 \
        -machine type=virt,highmem=off \
        -display none -m 2048 \
        -serial mon:stdio \
        -netdev user,id=unet,hostfwd=tcp::2222-:22 \
        -device virtio-net-pci,netdev=unet \
        -device virtio-scsi-pci \
        -blockdev driver=raw,node-name=hd,discard=unmap,file.driver=host_device,file.filename=/dev/zen-disk/debian-bullseye-armhf \
        -device scsi-hd,drive=hd -smp 4 \
        -kernel /home/alex/lsrc/linux.git/builds/arm/arch/arm/boot/zImage \
        -append 'console=ttyAMA0 root=/dev/sda2 systemd.unit=benchmark.service' \
        -snapshot"

It has a marginal effect on runtime, before:

  Time (mean ± σ):     26.279 s ±  2.438 s    [User: 41.113 s, System: 1.843 s]
  Range (min … max):   24.420 s … 32.565 s    20 runs

after:

  Time (mean ± σ):     24.440 s ±  2.885 s    [User: 34.474 s, System: 2.028 s]
  Range (min … max):   21.663 s … 29.937 s    20 runs

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1358
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20230526165401.574474-10-alex.bennee@linaro.org
Message-Id: <20230524133952.3971948-9-alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-06-01 11:05:05 -04:00
Richard Henderson b3f4144fa9 accel/tcg: Extract store_atom_insert_al16 to host header
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson af844a1149 accel/tcg: Extract load_atom_extract_al16_or_al8 to host header
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson 9e0e6a7e8e accel/tcg: Fix check for page writeability in load_atomic16_or_exit
PAGE_WRITE is current writability, as modified by TB protection;
PAGE_WRITE_ORG is the original page writability.

Fixes: cdfac37be0 ("accel/tcg: Honor atomicity of loads")
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-30 09:51:11 -07:00
Richard Henderson 645e3a812a tcg: Remove DEBUG_DISAS
This had been set since the beginning, is never undefined,
and it would seem to be harmful to debugging to do so.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson 8dc24ff467 accel/tcg: Correctly use atomic128.h in ldst_atomicity.c.inc
Remove the locally defined load_atomic16 and store_atomic16,
along with HAVE_al16 and HAVE_al16_fast in favor of the
routines defined in atomic128.h.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson 4deb39ebb3 accel/tcg: Eliminate #if on HAVE_ATOMIC128 and HAVE_CMPXCHG128
These symbols will shortly become dynamic runtime tests and
therefore not appropriate for the preprocessor.  Use the
matching CONFIG_* symbols for that purpose.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson 7bedee3243 accel/tcg: Remove prot argument to atomic_mmu_lookup
Now that load/store are gone, we're always passing
PAGE_READ | PAGE_WRITE for RMW atomic operations.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00
Richard Henderson ec4a9629a1 accel/tcg: Remove cpu_atomic_{ld,st}o_*_mmu
Atomic load/store of 128-byte quantities is now handled
by cpu_{ld,st}16_mmu.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-23 18:54:55 -07:00