qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Claudio Fontana	0545608056	cpu: move cc->do_interrupt to tcg_ops Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204163931.7358-10-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Eduardo Habkost	e9ce43e97a	cpu: Move debug_excp_handler to tcg_ops Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204163931.7358-8-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Eduardo Habkost	48c1a3e303	cpu: Move cpu_exec_* to tcg_ops Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> [claudio: wrapped target code in CONFIG_TCG] Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20210204163931.7358-6-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Eduardo Habkost	ec62595bab	cpu: Move synchronize_from_tb() to tcg_ops Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> [claudio: wrapped target code in CONFIG_TCG, reworded comments] Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20210204163931.7358-5-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Claudio Fontana	7df5e3d6ad	accel/tcg: split TCG-only code from cpu_exec_realizefn move away TCG-only code, make it compile only on TCG. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [claudio: moved the prototypes from hw/core/cpu.h to exec/cpu-all.h] Signed-off-by: Claudio Fontana <cfontana@suse.de> Message-Id: <20210204163931.7358-4-cfontana@suse.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-02-05 10:24:14 -10:00
Roman Bolshakov	653b87eb36	tcg: Toggle page execution for Apple Silicon Pages can't be both write and executable at the same time on Apple Silicon. macOS provides public API to switch write protection [1] for JIT applications, like TCG. 1. https://developer.apple.com/documentation/apple_silicon/porting_just-in-time_compilers_to_apple_silicon Tested-by: Alexander Graf <agraf@csgraf.de> Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20210113032806.18220-1-r.bolshakov@yadro.com> [rth: Inline the qemu_thread_jit_* functions; drop the MAP_JIT change for a follow-on patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:13:00 -10:00
Philippe Mathieu-Daudé	c03f041f12	accel/tcg: Restrict tb_gen_code() from other accelerators tb_gen_code() is only called within TCG accelerator, declare it locally. Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20210117164813.4101761-4-f4bug@amsat.org> [rth: Adjust vs changed tb_flush_jmp_cache patch.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-23 12:12:59 -10:00
Douglas Crosher	bfff072c50	tcg: update the cpu running flag in cpu_exec_step_atomic The cpu_exec_step_atomic() function is called with the cpu->running clear and proceeds to run target code without setting this flag. If this target code generates an exception then handle_cpu_signal() will unnecessarily abort. For example if atomic code generates a memory protection fault. This patch at least sets and clears this running flag, and adds some assertions to help detect other cases. Signed-off-by: Douglas Crosher <dtc-ubuntu@scieneer.com> Message-Id: <a272c656-f7c5-019d-1cc0-499b8f80f2fc@scieneer.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-22 12:48:01 -10:00
Richard Henderson	eba40358b4	tcg: Return the TB pointer from the rx region from exit_tb This produces a small pc-relative displacement within the generated code to the TB structure that preceeds it. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:42 -10:00
Richard Henderson	1acbad0f27	tcg: Adjust tb_target_set_jmp_target for split-wx Pass both rx and rw addresses to tb_target_set_jmp_target. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Richard Henderson	db0c51a380	tcg: Introduce tcg_splitwx_to_{rx,rw} Add two helper functions, using a global variable to hold the displacement. The displacement is currently always 0, so no change in behaviour. Begin using the functions in tcg common code only. Reviewed-by: Joelle van Dyne <j@getutm.app> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2021-01-07 05:09:41 -10:00
Daniele Buono	c905a3680d	cfi: Initial support for cfi-icall in QEMU LLVM/Clang, supports runtime checks for forward-edge Control-Flow Integrity (CFI). CFI on indirect function calls (cfi-icall) ensures that, in indirect function calls, the function called is of the right signature for the pointer type defined at compile time. For this check to work, the code must always respect the function signature when using function pointer, the function must be defined at compile time, and be compiled with link-time optimization. This rules out, for example, shared libraries that are dynamically loaded (given that functions are not known at compile time), and code that is dynamically generated at run-time. This patch: 1) Introduces the CONFIG_CFI flag to support cfi in QEMU 2) Introduces a decorator to allow the definition of "sensitive" functions, where a non-instrumented function may be called at runtime through a pointer. The decorator will take care of disabling cfi-icall checks on such functions, when cfi is enabled. 3) Marks functions currently in QEMU that exhibit such behavior, in particular: - The function in TCG that calls pre-compiled TBs - The function in TCI that interprets instructions - Functions in the plugin infrastructures that jump to callbacks - Functions in util that directly call a signal handler Signed-off-by: Daniele Buono <dbuono@linux.vnet.ibm.com> Acked-by: Alex Bennée <alex.bennee@linaro.org Message-Id: <20201204230615.2392-3-dbuono@linux.vnet.ibm.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-02 21:03:35 +01:00
Eduardo Habkost	710384d042	tcg: Make CPUClass.debug_excp_handler optional Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201212155530.23098-12-cfontana@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-16 15:50:33 -05:00
Eduardo Habkost	80c4750ba8	tcg: make CPUClass.cpu_exec_* optional This will let us simplify the code that initializes CPU class methods, when we move cpu_exec_*() to a separate struct. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20201212155530.23098-11-cfontana@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-16 15:50:33 -05:00
Eduardo Habkost	035ba06c2e	tcg: cpu_exec_{enter,exit} helpers Move invocation of CPUClass.cpu_exec_*() to separate helpers, to make it easier to refactor that code later. Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20201212155530.23098-10-cfontana@suse.de> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2020-12-16 15:50:33 -05:00
Philippe Mathieu-Daudé	19a84318c6	accel/tcg: Remove special case for GCC < 4.6 Since commit `efc6c070ac` ("configure: Add a test for the minimum compiler version") the minimum compiler version required for GCC is 4.8. We can safely remove the special case for GCC 4.6 introduced in commit `0448f5f8b8` ("cpu-exec: Fix compiler warning (-Werror=clobbered)"). No change for Clang as we don't know. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20201210134752.780923-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:52:08 -05:00
Pavel Dovgalyuk	835cbd8d44	icount: improve exec nocache usage cpu-exec tries to execute TB without caching when current icount budget is over. But sometimes refilled budget is big enough to try executing cached blocks. This patch checks that instruction budget is big enough for next block execution instead of just running cpu_exec_nocache. It halves the number of calls of cpu_exec_nocache function during tested OS boot scenario. Signed-off-by: Pavel Dovgalyuk <pavel.dovgalyuk@ispras.ru> Message-Id: <160741865825.348476.7169239332367828943.stgit@pasha-ThinkPad-X280> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-12-15 12:52:04 -05:00
Pavel Dovgalyuk	4084893ddc	replay: don't record interrupt poll Interrupt poll is not a real interrupt event. It is needed only for thread safety. This interrupt is used for i386 and converted to hardware interrupt by cpu_handle_interrupt function. Therefore it is not needed to be recorded, because hardware interrupt will be recorded after converting. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> -- v4 changes: - Condition check refactoring (suggested by Alex Bennée) Message-Id: <160174517124.12451.12983410242461131737.stgit@pasha-ThinkPad-X280> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-06 08:34:49 +02:00
Claudio Fontana	8191d36841	icount: rename functions to be consistent with the module name Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Claudio Fontana	740b175973	cpu-timers, icount: new modules refactoring of cpus.c continues with cpu timer state extraction. cpu-timers: responsible for the softmmu cpu timers state, including cpu clocks and ticks. icount: counts the TCG instructions executed. As such it is specific to the TCG accelerator. Therefore, it is built only under CONFIG_TCG. One complication is due to qtest, which uses an icount field to warp time as part of qtest (qtest_clock_warp). In order to solve this problem, provide a separate counter for qtest. This requires fixing assumptions scattered in the code that qtest_enabled() implies icount_enabled(), checking each specific case. Signed-off-by: Claudio Fontana <cfontana@suse.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [remove redundant initialization with qemu_spice_init] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> [fix lingering calls to icount_get] Signed-off-by: Claudio Fontana <cfontana@suse.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2020-10-05 16:41:22 +02:00
Stefan Hajnoczi	d73415a315	qemu/atomic.h: rename atomic_ to qatomic_ clang's C11 atomic_fetch_() functions only take a C11 atomic type pointer argument. QEMU uses direct types (int, etc) and this causes a compiler error when a QEMU code calls these functions in a source file that also included <stdatomic.h> via a system header file: $ CC=clang CXX=clang++ ./configure ... && make ../util/async.c:79:17: error: address argument to atomic operation must be a pointer to _Atomic type ('unsigned int ' invalid) Avoid using atomic_*() names in QEMU's atomic.h since that namespace is used by <stdatomic.h>. Prefix QEMU's APIs with 'q' so that atomic.h and <stdatomic.h> can co-exist. I checked /usr/include on my machine and searched GitHub for existing "qatomic_" users but there seem to be none. This patch was generated using: $ git grep -h -o '\<atomic$64$\?_[a-z0-9_]\+' include/qemu/atomic.h \| \ sort -u >/tmp/changed_identifiers $ for identifier in $(</tmp/changed_identifiers); do sed -i "s%\<$identifier\>%q$identifier%g" \ $(git grep -I -l "\<$identifier\>") done I manually fixed line-wrap issues and misaligned rST tables. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20200923105646.47864-1-stefanha@redhat.com>	2020-09-23 16:07:44 +01:00
Richard Henderson	ba3c35d9c4	tcg/cpu-exec: precise single-stepping after an interrupt When single-stepping with a debugger attached to QEMU, and when an interrupt is raised, the debugger misses the first instruction after the interrupt. Tested-by: Luc Michel <luc.michel@greensocs.com> Reviewed-by: Luc Michel <luc.michel@greensocs.com> Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Message-Id: <20200717163029.2737546-1-richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-07-17 11:09:34 -07:00
Luc Michel	a7ba744f40	tcg/cpu-exec: precise single-stepping after an exception When single-stepping with a debugger attached to QEMU, and when an exception is raised, the debugger misses the first instruction after the exception: $ qemu-system-aarch64 -M virt -display none -cpu cortex-a53 -s -S $ aarch64-linux-gnu-gdb GNU gdb (GDB) 9.2 [...] (gdb) tar rem :1234 Remote debugging using :1234 warning: No executable has been specified and target does not support determining executable automatically. Try using the "file" command. 0x0000000000000000 in ?? () (gdb) # writing nop insns to 0x200 and 0x204 (gdb) set 0x200 = 0xd503201f (gdb) set 0x204 = 0xd503201f (gdb) # 0x0 address contains 0 which is an invalid opcode. (gdb) # The CPU should raise an exception and jump to 0x200 (gdb) si 0x0000000000000204 in ?? () With this commit, the same run steps correctly on the first instruction of the exception vector: (gdb) si 0x0000000000000200 in ?? () Buglink: https://bugs.launchpad.net/qemu/+bug/757702 Signed-off-by: Luc Michel <luc.michel@greensocs.com> Message-Id: <20200716193947.3058389-1-luc.michel@greensocs.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-07-16 14:08:29 -07:00
Alex Bennée	886cc68943	accel/tcg: fix race in cpu_exec_step_atomic (bug 1863025) The bug describes a race whereby cpu_exec_step_atomic can acquire a TB which is invalidated by a tb_flush before we execute it. This doesn't affect the other cpu_exec modes as a tb_flush by it's nature can only occur on a quiescent system. The race was described as: B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code B3. tcg_tb_alloc obtains a new TB C3. TB obtained with tb_lookup__cpu_state or tb_gen_code (same TB as B2) A3. start_exclusive critical section entered A4. do_tb_flush is called, TB memory freed/re-allocated A5. end_exclusive exits critical section B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code B3. tcg_tb_alloc reallocates TB from B2 C4. start_exclusive critical section entered C5. cpu_tb_exec executes the TB code that was free in A4 The simplest fix is to widen the exclusive period to include the TB lookup. As a result we can drop the complication of checking we are in the exclusive region before we end it. Cc: Yifan <me@yifanlu.com> Buglink: https://bugs.launchpad.net/qemu/+bug/1863025 Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20200214144952.15502-1-alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-02-28 10:58:41 -08:00
Philippe Mathieu-Daudé	dcb32f1d8f	tcg: Search includes from the project root source directory We currently search both the root and the tcg/ directories for tcg files: $ git grep '#include "tcg/' \| wc -l 28 $ git grep '#include "tcg[^/]' \| wc -l 94 To simplify the preprocessor search path, unify by expliciting the tcg/ directory. Patch created mechanically by running: $ for x in \ tcg.h tcg-mo.h tcg-op.h tcg-opc.h \ tcg-op-gvec.h tcg-gvec-desc.h; do \ sed -i "s,#include \"$x\",#include \"tcg/$x\"," \ $(git grep -l "#include \"$x\""); \ done Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts) Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20200101112303.20724-2-philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2020-01-15 15:13:10 -10:00
Robert Foley	fc59d2d870	qemu_log_lock/unlock now preserves the qemu_logfile handle. qemu_log_lock() now returns a handle and qemu_log_unlock() receives a handle to unlock. This allows for changing the handle during logging and ensures the lock() and unlock() are for the same file. Also in target/tilegx/translate.c removed the qemu_log_lock()/unlock() calls (and the log("\n")), since the translator can longjmp out of the loop if it attempts to translate an instruction in an inaccessible page. Signed-off-by: Robert Foley <robert.foley@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Message-Id: <20191118211528.3221-5-robert.foley@linaro.org>	2019-12-18 20:18:02 +00:00
Emilio G. Cota	e6d86bed50	tcg: let plugins instrument virtual memory accesses To capture all memory accesses we need hook into all the various helper functions that are involved in memory operations as well as the injected inline helper calls. A later commit will allow us to resolve the actual guest HW addresses by replaying the lookup. Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: drop haddr handling, just deal in vaddr] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Emilio G. Cota	cfbc3c6083	cpu: introduce cpu_in_exclusive_context() Suggested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> [AJB: moved inside start/end_exclusive fns + cleanup] Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2019-10-28 15:12:38 +00:00
Pavel Dovgalyuk	ba3e792669	icount: clean up cpu_can_io at the entry to the block Most of IO instructions can be executed only at the end of the block in icount mode. Therefore translator can set cpu_can_io flag when translating the last instruction. But when the blocks are chained, then this flag is not reset and may remain set at the beginning of the next block. This patch resets the flag at the entry of any translation block, making I/O operations impossible by default. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> -- v2 changes: - reset can_do_io at the start of every TB (suggested by Paolo Bonzini) Message-Id: <156404428943.18669.15747009371169578935.stgit@pasha-Precision-3630-Tower> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-08-20 17:26:22 +02:00
Markus Armbruster	a8d2532645	Include qemu-common.h exactly where needed No header includes qemu-common.h after this commit, as prescribed by qemu-common.h's file comment. Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20190523143508.25387-5-armbru@redhat.com> [Rebased with conflicts resolved automatically, except for include/hw/arm/xlnx-zynqmp.h hw/arm/nrf51_soc.c hw/arm/msf2-soc.c block/qcow2-refcount.c block/qcow2-cluster.c block/qcow2-cache.c target/arm/cpu.h target/lm32/cpu.h target/m68k/cpu.h target/mips/cpu.h target/moxie/cpu.h target/nios2/cpu.h target/openrisc/cpu.h target/riscv/cpu.h target/tilegx/cpu.h target/tricore/cpu.h target/unicore32/cpu.h target/xtensa/cpu.h; bsd-user/main.c and net/tap-bsd.c fixed up]	2019-06-12 13:20:20 +02:00
Richard Henderson	5e1401969b	cpu: Move icount_decr to CPUNegativeOffsetState Amusingly, we had already ignored the comment to keep this value at the end of CPUState. This restores the minimum negative offset from TCG_AREG0 for code generation. For the couple of uses within qom/cpu.c, without NEED_CPU_H, add a pointer from the CPUState object to the IcountDecr object within CPUNegativeOffsetState. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:42 -07:00
Peter Maydell	713acc316d	Queued accel/tcg patches -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcWle8AAoJEGTfOOivfiFfYeAIAKAu8KlM6JVHVITkk58lzRmT /cwd2O0yRroROslxrdAC0IRgMAH/oSIRK2W3dAIXLXb2TNuf3ydvdkBs6oVGWYnd 5P6fdFSq4ZqOGY6Y2QbBe1RhIOnmYOVreVePQsT+ofIFZOYrP0hKtcc68nB8nGZM vVdlfotU/6aQuPZnXVBPNsXZCZEWz67JY1pzJTTNJKtPV51jsgxNSp4ktXuqkmvY RnnG347hbQWFs5bCaVXoSp5rzuDxfjlI5/vJw/HuG3h47LSWkiq1GYg39d5Yn802 N7nZURXEI6onmLkTVndV/EkA1yqvUpOh4NM7S1hZVJ2DXu/wXcEr8kAATSeYkos= =Gmjb -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190206' into staging Queued accel/tcg patches # gpg: Signature made Wed 06 Feb 2019 03:42:52 GMT # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20190206: accel/tcg: Consider cluster index in tb_lookup__cpu_state() tcg: add early clober modifier in atomic16_cmpxchg on aarch64 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-02-07 11:46:40 +00:00
Peter Maydell	9fd9b7de61	accel/tcg: Consider cluster index in tb_lookup__cpu_state() In commit `f7b78602fd` we added the CPU cluster number to the cflags field of the TB hash; this included adding it to the value kept in tb->cflags, since we pass that field directly into the hash calculation in some places. Unfortunately we forgot to check whether other parts of the code were doing comparisons against tb->cflags that would need to be updated. It turns out that there is exactly one such place: the tb_lookup__cpu_state() function checks whether the TB it has found in the tb_jmp_cache has a tb->cflags matching the cf_mask that is passed in. The tb->cflags has the cluster_index in it but the cf_mask does not. Hoist the "add cluster index to the cf_mask" code up from tb_htable_lookup() to tb_lookup__cpu_state() so it can be considered in the "did this TB match in the jmp cache" condition, as well as when we do the full hash lookup by physical PC, flags, etc. (tb_htable_lookup() is only called from tb_lookup__cpu_state(), so this change doesn't require any further knock-on changes.) Fixes: `f7b78602fd` ("accel/tcg: Add cluster number to TCG TB hash") Tested-by: Cleber Rosa <crosa@redhat.com> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reported-by: Howard Spoelstra <hsp.cat7@gmail.com> Reported-by: Cleber Rosa <crosa@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20190205151810.571-1-peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-06 03:39:24 +00:00
Emilio G. Cota	6aaa24f9d4	cpu-exec: reset BQL after longjmp in cpu_exec_step_atomic Just like we do in cpu_exec(). Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-02-05 16:50:16 +01:00
Emilio G. Cota	8fd3a9b81d	cpu-exec: add assert_no_pages_locked() after longjmp We forgot to add this check in `faa9372c07` ("translate-all: introduce assert_no_pages_locked", 2018-06-15); we only added it after returning from a longjmp in cpu_exec_step_atomic. Fix it. Signed-off-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2019-02-05 16:50:16 +01:00
Thomas Huth	fb0343d5b4	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 11:01:52 +01:00
Peter Maydell	f7b78602fd	accel/tcg: Add cluster number to TCG TB hash Include the cluster number in the hash we use to look up TBs. This is important because a TB that is valid for one cluster at a given physical address and set of CPU flags is not necessarily valid for another: the two clusters may have different views of physical memory, or may have different CPU features (eg FPU present or absent). We put the cluster number in the high 8 bits of the TB cflags. This gives us up to 256 clusters, which should be enough for anybody. If we ever need more, or need more bits in cflags for other purposes, we could make tb_hash_func() take more data (and expand qemu_xxhash7() to qemu_xxhash8()). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-4-peter.maydell@linaro.org	2019-01-29 11:46:06 +00:00
Richard Henderson	d7f425fdea	tcg: Implement CPU_LOG_TB_NOCHAIN during expansion Rather than test NOCHAIN before linking, do not emit the goto_tb opcode at all. We already do this for goto_ptr. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Peter Maydell	7252f2dea9	accel/tcg: Handle get_page_addr_code() returning -1 in hashtable lookups When we support execution from non-RAM MMIO regions, get_page_addr_code() will return -1 to indicate that there is no RAM at the requested address. Handle this in the cpu-exec TB hashtable lookup code, treating it as "no match found". Note that the call to get_page_addr_code() in tb_lookup_cmp() needs no changes -- a return of -1 will already correctly result in the function returning false. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-3-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Emilio G. Cota	0ac20318ce	tcg: remove tb_lock Use mmap_lock in user-mode to protect TCG state and the page descriptors. In !user-mode, each vCPU has its own TCG state, so no locks needed. Per-page locks are used to protect the page descriptors. Per-TB locks are used in both modes to protect TB jumps. Some notes: - tb_lock is removed from notdirty_mem_write by passing a locked page_collection to tb_invalidate_phys_page_fast. - tcg_tb_lookup/remove/insert/etc have their own internal lock(s), so there is no need to further serialize access to them. - do_tb_flush is run in a safe async context, meaning no other vCPU threads are running. Therefore acquiring mmap_lock there is just to please tools such as thread sanitizer. - Not visible in the diff, but tb_invalidate_phys_page already has an assert_memory_lock. - cpu_io_recompile is !user-only, so no mmap_lock there. - Added mmap_unlock()'s before all siglongjmp's that could be called in user-mode while mmap_lock is held. + Added an assert for !have_mmap_lock() after returning from the longjmp in cpu_exec, just like we do in cpu_exec_step_atomic. Performance numbers before/after: Host: AMD Opteron(tm) Processor 6376 ubuntu 17.04 ppc64 bootup+shutdown time 700 +-+--+----+------+------------+-----------+--------------+-+ \| + + + + + B \| \| before *B* ** * \| \|tb lock removal ###D### * \| 600 +-+ * +-+ \| ** # \| \| B #D \| \| *** * ## \| 500 +-+ *** ### +-+ \| * *** ### \| \| B # ## \| \| ** * #D# \| 400 +-+ ## +-+ \| ### \| \| ## \| \| # ## \| 300 +-+ * B* #D# +-+ \| B *** ### \| \| * ** #### \| \| * *** ### \| 200 +-+ B B #D# +-+ \| #B * ## # \| \| #* ## \| \| + D##D# + + + + \| 100 +-+--+----+------+------------+-----------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/HwmBHXe debian jessie aarch64 bootup+shutdown time 90 +-+--+-----+-----+------------+------------+------------+--+-+ \| + + + + + + \| \| before *B* B \| 80 +tb lock removal ###D### D +-+ \| ### \| \| ## \| 70 +-+ # +-+ \| ## \| \| # \| 60 +-+ B ## +-+ \| * ## \| \| * #D \| 50 +-+ * ## +-+ \| * ### \| \| B* ### \| 40 +-+ ** # ## +-+ \| #D# \| \| B* ### \| 30 +-+ B*B #### +-+ \| B * * # ### \| \| B ###D# \| 20 +-+ D ##D## +-+ \| D# \| \| + + + + + + \| 10 +-+--+-----+-----+------------+------------+------------+--+-+ 1 8 16 Guest CPUs 48 64 png: https://imgur.com/iGpGFtv The gains are high for 4-8 CPUs. Beyond that point, however, unrelated lock contention significantly hurts scalability. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	194125e3eb	translate-all: protect TB jumps with a per-destination-TB lock This applies to both user-mode and !user-mode emulation. Instead of relying on a global lock, protect the list of incoming jumps with tb->jmp_lock. This lock also protects tb->cflags, so update all tb->cflags readers outside tb->jmp_lock to use atomic reads via tb_cflags(). In order to find the destination TB (and therefore its jmp_lock) from the origin TB, we introduce tb->jmp_dest[]. I considered not using a linked list of jumps, which simplifies code and makes the struct smaller. However, it unnecessarily increases memory usage, which results in a performance decrease. See for instance these numbers booting+shutting down debian-arm: Time (s) Rel. err (%) Abs. err (s) Rel. slowdown (%) ------------------------------------------------------------------------------ before 20.88 0.74 0.154512 0. after 20.81 0.38 0.079078 -0.33524904 GTree 21.02 0.28 0.058856 0.67049808 GHashTable + xxhash 21.63 1.08 0.233604 3.5919540 Using a hash table or a binary tree to keep track of the jumps doesn't really pay off, not only due to the increased memory usage, but also because most TBs have only 0 or 1 jumps to them. The maximum number of jumps when booting debian-arm that I measured is 35, but as we can see in the histogram below a TB with that many incoming jumps is extremely rare; the average TB has 0.80 incoming jumps. n_jumps: 379208; avg jumps/tb: 0.801099 dist: [0.0,1.0)\|▄█▁▁▁▁▁▁▁▁▁▁▁ ▁▁▁▁▁▁ ▁▁▁ ▁▁▁ ▁\|[34.0,35.0] Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Emilio G. Cota	95590e24af	translate-all: discard TB when tb_link_page returns an existing matching TB Use the recently-gained QHT feature of returning the matching TB if it already exists. This allows us to get rid of the lookup we perform right after acquiring tb_lock. Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:42 -10:00
Emilio G. Cota	faa9372c07	translate-all: introduce assert_no_pages_locked The appended adds assertions to make sure we do not longjmp with page locks held. Note that user-mode has nothing to check, since page_locks are !user-mode only. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	be2cdc5e35	tcg: track TBs with per-region BST's This paves the way for enabling scalable parallel generation of TCG code. Instead of tracking TBs with a single binary search tree (BST), use a BST for each TCG region, protecting it with a lock. This is as scalable as it gets, since each TCG thread operates on a separate region. The core of this change is the introduction of struct tcg_region_tree, which contains a pointer to a GTree and an associated lock to serialize accesses to it. We then allocate an array of tcg_region_tree's, adding the appropriate padding to avoid false sharing based on qemu_dcache_linesize. Given a tc_ptr, we first find the corresponding region_tree. This is done by special-casing the first and last regions first, since they might be of size != region.size; otherwise we just divide the offset by region.stride. I was worried about this division (several dozen cycles of latency), but profiling shows that this is not a fast path. Note that region.stride is not required to be a power of two; it is only required to be a multiple of the host's page size. Note that with this design we can also provide consistent snapshots about all region trees at once; for instance, tcg_tb_foreach acquires/releases all region_tree locks before/after iterating over them. For this reason we now drop tb_lock in dump_exec_info(). As an alternative I considered implementing a concurrent BST, but this can be tricky to get right, offers no consistent snapshots of the BST, and performance and scalability-wise I don't think it could ever beat having separate GTrees, given that our workload is insert-mostly (all concurrent BST designs I've seen focus, understandably, on making lookups fast, which comes at the expense of convoluted, non-wait-free insertions/removals). Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Emilio G. Cota	61b8cef1d4	qht: require a default comparison function qht_lookup now uses the default cmp function. qht_lookup_custom is defined to retain the old behaviour, that is a cmp function is explicitly provided. qht_insert will gain use of the default cmp in the next patch. Note that we move qht_lookup_custom's @func to be the last argument, which makes the new qht_lookup as simple as possible. Instead of this (i.e. keeping @func 2nd): 0000000000010750 <qht_lookup>: 10750: 89 d1 mov %edx,%ecx 10752: 48 89 f2 mov %rsi,%rdx 10755: 48 8b 77 08 mov 0x8(%rdi),%rsi 10759: e9 22 ff ff ff jmpq 10680 <qht_lookup_custom> 1075e: 66 90 xchg %ax,%ax We get: 0000000000010740 <qht_lookup>: 10740: 48 8b 4f 08 mov 0x8(%rdi),%rcx 10744: e9 37 ff ff ff jmpq 10680 <qht_lookup_custom> 10749: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 07:42:55 -10:00
Philippe Mathieu-Daudé	df924c0643	accel: Do not include "exec/address-spaces.h" if it is not necessary Code change produced with: $ git grep '#include "exec/address-spaces.h"' accel \| \ cut -d: -f-1 \| \ xargs egrep -L "(get_system_\|address_space_)" \| \ xargs sed -i.bak '/#include "exec\/address-spaces.h"/d' Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-Id: <20180528232719.4721-3-f4bug@amsat.org> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-05-31 19:12:13 +02:00
Peter Maydell	ae76518047	tcg: Optionally log FPU state in TCG -d cpu logging Usually the logging of the CPU state produced by -d cpu is sufficient to diagnose problems, but sometimes you want to see the state of the floating point registers as well. We don't want to enable that by default as it adds a lot of extra data to the log; instead, allow it to be optionally enabled via -d fpu. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180510130024.31678-1-peter.maydell@linaro.org	2018-05-15 14:58:44 +01:00
Pavel Dovgalyuk	afd46fcad2	icount: fix cpu_restore_state_from_tb for non-tb-exit cases In icount mode, instructions that access io memory spaces in the middle of the translation block invoke TB recompilation. After recompilation, such instructions become last in the TB and are allowed to access io memory spaces. When the code includes instruction like i386 'xchg eax, 0xffffd080' which accesses APIC, QEMU goes into an infinite loop of the recompilation. This instruction includes two memory accesses - one read and one write. After the first access, APIC calls cpu_report_tpr_access, which restores the CPU state to get the current eip. But cpu_restore_state_from_tb resets the cpu->can_do_io flag which makes the second memory access invalid. Therefore the second memory access causes a recompilation of the block. Then these operations repeat again and again. This patch moves resetting cpu->can_do_io flag from cpu_restore_state_from_tb to cpu_loop_exit* functions. It also adds a parameter for cpu_restore_state which controls restoring icount. There is no need to restore icount when we only query CPU state without breaking the TB. Restoring it in such cases leads to the incorrect flow of the virtual time. In most cases new parameter is true (icount should be recalculated). But there are two cases in i386 and openrisc when the CPU state is only queried without the need to break the TB. This patch fixes both of these cases. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20180409091320.12504.35329.stgit@pasha-VirtualBox> [rth: Make can_do_io setting unconditional; move from cpu_exec; make cpu_loop_exit_{noexc,restore} call cpu_loop_exit.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-04-11 09:05:22 +10:00
Pavel Dovgalyuk	5f3bdfd4fa	cpu-exec: fix exception_index handling Function cpu_handle_interrupt calls cc->cpu_exec_interrupt to process pending hardware interrupts. Under the hood cpu_exec_interrupt uses cpu->exception_index to pass information to the internal function which is usually common for exception and interrupt processing. But this value is not reset after return and may be processed again by cpu_handle_exception. This does not happen due to overwriting the exception_index at the end of cpu_handle_interrupt. But this branch may also overwrite the valid exception_index in some cases. Therefore this patch: 1. resets exception_index just after the call to cpu_exec_interrupt 2. prevents overwriting the meaningful value of exception_index Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20180227095140.1060.61357.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>	2018-03-12 16:12:50 +01:00
Paolo Bonzini	4fad446bc9	tcg: add cs_base and flags to -d exec output Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Message-Id: <20171217055023.29225-1-pbonzini@redhat.com> [rth: Also change the Chain logging in helper_lookup_tb_ptr.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-12-29 12:43:40 -08:00

1 2

68 Commits