qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Richard Henderson	29a0af618d	cpu: Replace ENV_GET_CPU with env_cpu Now that we have both ArchCPU and CPUArchState, we can define this generically instead of via macro in each target's cpu.h. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:34 -07:00
Richard Henderson	a40ec84ee2	tcg: Create struct CPUTLB Move all softmmu tlb data into this structure. Arrange the members so that we are able to place mask+table together and at a smaller absolute offset from ENV. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Acked-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:34 -07:00
Richard Henderson	79e4208506	tcg: Fold CPUTLBWindow into CPUTLBDesc Both structures are allocated once per mmu_idx. There is no reason for them to be separate. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-06-10 07:03:34 -07:00
Peter Maydell	d8276573da	Add CPUClass::tlb_fill. Improve tlb_vaddr_to_host for use by ARM SVE no-fault loads. -----BEGIN PGP SIGNATURE----- iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAlzVx4UdHHJpY2hhcmQu aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV+U1Af/b3cV5d5a1LWRdLgR 71JCPK/M3o43r2U9wCSikteXkmNBEdEoc5+WRk2SuZFLW/JB1DHDY7/gISPIhfoB ZIza2TxD/QK1CQ5/mMWruKBlyygbYYZgsYaaNsMJRJgicgOSjTN0nuHMbIfv3tAN mu+IlkD0LdhVjP0fz30Jpew3b3575RCjYxEPM6KQI3RxtQFjZ3FhqV5hKR4vtdP5 yLWJQzwAbaCB3SZUvvp7TN1ZsmeyLpc+Yz/YtRTqQedo7SNWWBKldLhqq4bZnH1I AkzHbtWIOBrjWJ34ZMAgI5Q56Du9TBbBvCdM9azmrQjSu/2kdsPBPcUyOpnUCsCx NyXo9g== =x71l -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190510' into staging Add CPUClass::tlb_fill. Improve tlb_vaddr_to_host for use by ARM SVE no-fault loads. # gpg: Signature made Fri 10 May 2019 19:48:37 BST # gpg: using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F # gpg: issuer "richard.henderson@linaro.org" # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20190510: (27 commits) tcg: Use tlb_fill probe from tlb_vaddr_to_host tcg: Remove CPUClass::handle_mmu_fault tcg: Use CPUClass::tlb_fill in cputlb.c target/xtensa: Convert to CPUClass::tlb_fill target/unicore32: Convert to CPUClass::tlb_fill target/tricore: Convert to CPUClass::tlb_fill target/tilegx: Convert to CPUClass::tlb_fill target/sparc: Convert to CPUClass::tlb_fill target/sh4: Convert to CPUClass::tlb_fill target/s390x: Convert to CPUClass::tlb_fill target/riscv: Convert to CPUClass::tlb_fill target/ppc: Convert to CPUClass::tlb_fill target/openrisc: Convert to CPUClass::tlb_fill target/nios2: Convert to CPUClass::tlb_fill target/moxie: Convert to CPUClass::tlb_fill target/mips: Convert to CPUClass::tlb_fill target/mips: Tidy control flow in mips_cpu_handle_mmu_fault target/mips: Pass a valid error to raise_mmu_exception for user-only target/microblaze: Convert to CPUClass::tlb_fill target/m68k: Convert to CPUClass::tlb_fill ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-05-16 13:15:08 +01:00
Richard Henderson	4601f8d10d	cputlb: Do unaligned store recursion to outermost function This is less tricky than for loads, because we always fall back to single byte stores to implement unaligned stores. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>	2019-05-10 20:23:21 +01:00
Richard Henderson	2dd9260678	cputlb: Do unaligned load recursion to outermost function If we attempt to recurse from load_helper back to load_helper, even via intermediary, we do not get all of the constants expanded away as desired. But if we recurse back to the original helper (or a shim that has a consistent function signature), the operands are folded away as desired. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>	2019-05-10 20:23:21 +01:00
Richard Henderson	fc1bc77791	cputlb: Drop attribute flatten Going to approach this problem via __attribute__((always_inline)) instead, but full conversion will take several steps. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>	2019-05-10 20:23:21 +01:00
Richard Henderson	f1be36969d	cputlb: Move TLB_RECHECK handling into load/store_helper Having this in io_readx/io_writex meant that we forgot to re-compute index after tlb_fill. It also means we can use the normal aligned memory load path. It also fixes a bug in that we had cached a use of index across a tlb_fill. Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>	2019-05-10 20:23:21 +01:00
Alex Bennée	eed5664238	accel/tcg: demacro cputlb Instead of expanding a series of macros to generate the load/store helpers we move stuff into common functions and rely on the compiler to eliminate the dead code for each variant. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>	2019-05-10 20:23:06 +01:00
Richard Henderson	4811e9095c	tcg: Use tlb_fill probe from tlb_vaddr_to_host Most of the existing users would continue around a loop which would fault the tlb entry in via a normal load/store. But for AArch64 SVE we have an existing emulation bug wherein we would mark the first element of a no-fault vector load as faulted (within the FFR, not via exception) just because we did not have its address in the TLB. Now we can properly only mark it as faulted if there really is no valid, readable translation, while still not raising an exception. (Note that beyond the first element of the vector, the hardware may report a fault for any reason whatsoever; with at least one element loaded, forward progress is guaranteed.) Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-10 11:12:50 -07:00
Richard Henderson	c319dc1357	tcg: Use CPUClass::tlb_fill in cputlb.c We can now use the CPUClass hook instead of a named function. Create a static tlb_fill function to avoid other changes within cputlb.c. This also isolates the asserts within. Remove the named tlb_fill function from all of the targets. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-05-10 11:12:50 -07:00
Shahab Vahedi	ef5dae6805	cputlb: Fix io_readx() to respect the access_type This change adapts io_readx() to its input access_type. Currently io_readx() treats any memory access as a read, although it has an input argument "MMUAccessType access_type". This results in: 1) Calling the tlb_fill() only with MMU_DATA_LOAD 2) Considering only entry->addr_read as the tlb_addr Buglink: https://bugs.launchpad.net/qemu/+bug/1825359 Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Shahab Vahedi <shahab.vahedi@gmail.com> Message-Id: <20190420072236.12347-1-shahab.vahedi@gmail.com> [rth: Remove assert; fix expression formatting.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-04-25 10:40:06 -07:00
Emilio G. Cota	6d967cb86d	cputlb: update TLB entry/index after tlb_fill We are failing to take into account that tlb_fill() can cause a TLB resize, which renders prior TLB entry pointers/indices stale. Fix it by re-doing the TLB entry lookups immediately after tlb_fill. Fixes: `86e1eff8bc` ("tcg: introduce dynamic TLB sizing", 2019-01-28) Reported-by: Max Filippov <jcmvbkbc@gmail.com> Tested-by: Max Filippov <jcmvbkbc@gmail.com> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190209162745.12668-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-02-11 08:52:44 -08:00
Thomas Huth	fb0343d5b4	tcg: Fix LGPL version number It's either "GNU Library General Public version 2" or "GNU Lesser General Public version 2.1", but there was no "version 2.0" of the "Lesser" library. So assume that version 2.1 is meant here. Cc: Richard Henderson <rth@twiddle.net> Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <1548252536-6242-5-git-send-email-thuth@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 11:01:52 +01:00
Richard Henderson	e77c89fb08	cputlb: Remove static tlb sizing Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB, remove the define and the old code. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:35 -08:00
Emilio G. Cota	86e1eff8bc	tcg: introduce dynamic TLB sizing Disabled in all TCG backends for now. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Emilio G. Cota	3cea94bbc9	cputlb: do not evict empty entries to the vtlb Currently we evict an entry to the victim TLB when it doesn't match the current address. But it could be that there's no match because the current entry is empty (i.e. all -1's, for instance via tlb_flush). Do not evict the entry to the vtlb in that case. This change will help us keep track of the TLB's use rate, which we'll use to implement a policy for dynamic TLB sizing. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-2-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	ab65110530	cputlb: Remove tlb_c.pending_flushes This is essentially redundant with tlb_c.dirty. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:39 +00:00
Richard Henderson	3d1523ced6	cputlb: Filter flushes on already clean tlbs Especially for guests with large numbers of tlbs, like ARM or PPC, we may well not use all of them in between flush operations. Remember which tlbs have been used since the last flush, and avoid any useless flushing. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:35 +00:00
Richard Henderson	e09de0a20d	cputlb: Count "partial" and "elided" tlb flushes Our only statistic so far was "full" tlb flushes, where all mmu_idx are flushed at the same time. Now count "partial" tlb flushes where sets of mmu_idx are flushed, but the set is not maximal. Account one per mmu_idx flushed, as that is the unit of work performed. We don't actually count elided flushes yet, but go ahead and change the interface presented to the monitor all at once. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:30 +00:00
Richard Henderson	f8144c6c1e	cputlb: Merge tlb_flush_page into tlb_flush_page_by_mmuidx The difference between the two sets of APIs is now miniscule. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:26 +00:00
Richard Henderson	64f2674bbc	cputlb: Merge tlb_flush_nocheck into tlb_flush_by_mmuidx_async_work The difference between the two sets of APIs is now miniscule. This allows tlb_flush, tlb_flush_all_cpus, and tlb_flush_all_cpus_synced to be merged with their corresponding by_mmuidx functions as well. For accounting, consider mmu_idx_bitmask = ALL_MMUIDX_BITS to be a full flush. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:16 +00:00
Richard Henderson	d5363e5849	cputlb: Move env->vtlb_index to env->tlb_d.vindex The rest of the tlb victim cache is per-tlb, the next use index should be as well. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:12 +00:00
Richard Henderson	1308e02671	cputlb: Split large page tracking per mmu_idx The set of large pages in the kernel is probably not the same as the set of large pages in the application. Forcing one range to cover both will flush more often than necessary. This allows tlb_flush_page_async_work to flush just the one mmu_idx implicated, which in turn allows us to remove tlb_check_page_and_flush_by_mmuidx_async_work. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:08 +00:00
Richard Henderson	60a2ad7d86	cputlb: Move cpu->pending_tlb_flush to env->tlb_c.pending_flush Protect it with the tlb_lock instead of using atomics. The move puts it in or near the same cacheline as the lock; using the lock means we don't need a second atomic operation in order to perform the update. Which makes it cheap to also update pending_flush in tlb_flush_by_mmuidx_async_work. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:16:02 +00:00
Richard Henderson	8ab102667e	cputlb: Remove tcg_enabled hack from tlb_flush_nocheck The bugs this was working around were fixed with commits `022d6378c7` target/unicore32: remove tlb_flush from uc32_init_fn `6e11beecfd` target/alpha: remove tlb_flush from alpha_cpu_initfn Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:15:57 +00:00
Richard Henderson	53d284554c	cputlb: Move tlb_lock to CPUTLBCommon This is the first of several moves to reduce the size of the CPU_COMMON_TLB macro and improve some locality of refernce. Tested-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-31 12:15:28 +00:00
Emilio G. Cota	403f290c06	cputlb: read CPUTLBEntry.addr_write atomically Updates can come from other threads, so readers that do not take tlb_lock must use atomic_read to avoid undefined behaviour (UB). This completes the conversion to tlb_lock. This conversion results on average in no performance loss, as the following experiments (run on an Intel i7-6700K CPU @ 4.00GHz) show. 1. aarch64 bootup+shutdown test: - Before: Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs): 7487.087786 task-clock (msec) # 0.998 CPUs utilized ( +- 0.12% ) 31,574,905,303 cycles # 4.217 GHz ( +- 0.12% ) 57,097,908,812 instructions # 1.81 insns per cycle ( +- 0.08% ) 10,255,415,367 branches # 1369.747 M/sec ( +- 0.08% ) 173,278,962 branch-misses # 1.69% of all branches ( +- 0.18% ) 7.504481349 seconds time elapsed ( +- 0.14% ) - After: Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs): 7462.441328 task-clock (msec) # 0.998 CPUs utilized ( +- 0.07% ) 31,478,476,520 cycles # 4.218 GHz ( +- 0.07% ) 57,017,330,084 instructions # 1.81 insns per cycle ( +- 0.05% ) 10,251,929,667 branches # 1373.804 M/sec ( +- 0.05% ) 173,023,787 branch-misses # 1.69% of all branches ( +- 0.11% ) 7.474970463 seconds time elapsed ( +- 0.07% ) 2. SPEC06int: SPEC06int (test set) [Y axis: Speedup over master] 1.15 +-+----+------+------+------+------+------+-------+------+------+------+------+------+------+----+-+ \| \| 1.1 +-+.................................+++.............................+ tlb-lock-v2 (m+++x) +-+ \| +++ \| +++ tlb-lock-v3 (spinl\|ck) \| \| +++ \| \| +++ +++ \| \| \| 1.05 +-+....+++...........####.........\|####.+++.\|......\|.....###....+++...........+++....###.........+-+ \| ### ++#\| # \|# \|# *### +++### +++#+# \| +++ \| #\|# ### \| 1 +-++++#++++####+++#++#++++++++++#++#++++#++++#+#+**+#++++###++++###++++###++++#+#++++#+#+++-+ \| +* # #++# * # #### * # * ++# **+# \| * # ***\|# \|# # #\|# #+# # # \| 0.95 +-+....#....#..#.\|..#...#..#.\|..#....#.\|..#.++.#.+++#.**.#....#+#....#.#..++#.#..+-+ \| * # # # \| # # # \| # * * # ++ # * * # * * # * \|* # ++# # # # *** # \| \| * * # ++# # + # # # \| # * * # * * # * * # * * # ++ # **** # ++# # * * # \| 0.9 +-+....#...\|#..#....#.++#..#.\|..#....#....#....#....#....#..\|.#...\|#.#....#..+-+ \| * * # *** # * * # \|# # + # * * # * * # * * # * * # * * # ++ # \|# # * * # \| 0.85 +-+....#..\|..#....#.**..#....#....#....#....#....#....#....#.**.#....#..+-+ \| * # + # * * # \| # * * # * * # * * # * * # * * # * * # * * # * \|* # * * # \| \| * * # * * # * * # + # * * # * * # * * # * * # * * # * * # * * # * \|* # * * # \| 0.8 +-+....#.....#....#....#....#....#....#....#....#....#....#.++.#....#..+-+ \| * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # \| 0.75 +-+--*##--###-###-###-###-###-*##-##-##-##-##-##--*##--+-+ 400.perlben401.bzip2403.gcc429.m445.gob456.hmme45462.libqua464.h26471.omnet473483.xalancbmkgeomean png: https://imgur.com/a/BHzpPTW Notes: - tlb-lock-v2 corresponds to an implementation with a mutex. - tlb-lock-v3 corresponds to the current implementation, i.e. a spinlock and a single lock acquisition in tlb_set_page_with_attrs. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181016153840.25877-1-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	e6cd4bb59b	tcg: Split CONFIG_ATOMIC128 GCC7+ will no longer advertise support for 16-byte __atomic operations if only cmpxchg is supported, as for x86_64. Fortunately, x86_64 still has support for __sync_compare_and_swap_16 and we can make use of that. AArch64 does not have, nor ever has had such support, so open-code it. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:36 -07:00
Richard Henderson	383beda9cf	tcg: Add tlb_index and tlb_entry helpers Isolate the computation of an index from an address into a helper before we change that function. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> [ cota: convert tlb_vaddr_to_host; use atomic_read on addr_write ] Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009175129.17888-2-cota@braap.org>	2018-10-18 18:58:10 -07:00
Emilio G. Cota	71aec3541d	cputlb: serialize tlb updates with env->tlb_lock Currently we rely on atomic operations for cross-CPU invalidations. There are two cases that these atomics miss: cross-CPU invalidations can race with either (1) vCPU threads flushing their TLB, which happens via memset, or (2) vCPUs calling tlb_reset_dirty on their TLB, which updates .addr_write with a regular store. This results in undefined behaviour, since we're mixing regular and atomic ops on concurrent accesses. Fix it by using tlb_lock, a per-vCPU lock. All updaters of tlb_table and the corresponding victim cache now hold the lock. The readers that do not hold tlb_lock must use atomic reads when reading .addr_write, since this field can be updated by other threads; the conversion to atomic reads is done in the next patch. Note that an alternative fix would be to expand the use of atomic ops. However, in the case of TLB flushes this would have a huge performance impact, since (1) TLB flushes can happen very frequently and (2) we currently use a full memory barrier to flush each TLB entry, and a TLB has many entries. Instead, acquiring the lock is barely slower than a full memory barrier since it is uncontended, and with a single lock acquisition we can flush the entire TLB. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-6-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Emilio G. Cota	ea9025cb49	cputlb: fix assert_cpu_is_self macro Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-5-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Emilio G. Cota	5005e2537d	exec: introduce tlb_init Paves the way for the addition of a per-TLB lock. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181009174557.16125-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 18:58:10 -07:00
Peter Maydell	55a7cb144d	accel/tcg: Check whether TLB entry is RAM consistently with how we set it up We set up TLB entries in tlb_set_page_with_attrs(), where we have some logic for determining whether the TLB entry is considered to be RAM-backed, and thus has a valid addend field. When we look at the TLB entry in get_page_addr_code(), we use different logic for determining whether to treat the page as RAM-backed and use the addend field. This is confusing, and in fact buggy, because the code in tlb_set_page_with_attrs() correctly decides that rom_device memory regions not in romd mode are not RAM-backed, but the code in get_page_addr_code() thinks they are RAM-backed. This typically results in "Bad ram pointer" assertion if the guest tries to execute from such a memory region. Fix this by making get_page_addr_code() just look at the TLB_MMIO bit in the code_address field of the TLB, which tlb_set_page_with_attrs() sets if and only if the addend field is not valid for code execution. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180713150945.12348-1-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Peter Maydell	20cb6ae472	accel/tcg: Return -1 for execution from MMIO regions in get_page_addr_code() Now that all the callers can handle get_page_addr_code() returning -1, remove all the code which tries to handle execution from MMIO regions or small-MMU-region RAM areas. This will mean that we can correctly execute from these areas, rather than ending up either aborting QEMU or delivering an incorrect guest exception. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Cédric Le Goater <clg@kaod.org> Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20180710160013.26559-6-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Peter Maydell	dbea78a4d6	accel/tcg: Pass read access type through to io_readx() The io_readx() function needs to know whether the load it is doing is an MMU_DATA_LOAD or an MMU_INST_FETCH, so that it can pass the right value to the cpu_transaction_failed() function. Plumb this information through from the softmmu code. This is currently not often going to give the wrong answer, because usually instruction fetches go via get_page_addr_code(). However once we switch over to handling execution from non-RAM by creating single-insn TBs, the path for an insn fetch to generate a bus error will be through cpu_ld*_code() and io_readx(), so without this change we will generate a d-side fault when we should generate an i-side fault. We also have to pass the access type via a CPU struct global down to unassigned_mem_read(), for the benefit of the targets which still use the cpu_unassigned_access() hook (m68k, mips, sparc, xtensa). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Cédric Le Goater <clg@kaod.org> Message-id: 20180710160013.26559-2-peter.maydell@linaro.org	2018-08-14 17:17:19 +01:00
Peter Maydell	3474c98a2a	accel/tcg: Assert that tlb fill gave us a valid TLB entry In commit `4b1a3e1e34` we added a check for whether the TLB entry we had following a tlb_fill had the INVALID bit set. This could happen in some circumstances because a stale or wrong TLB entry was pulled out of the victim cache. However, after commit `68fea03855` (which prevents stale entries being in the victim cache) and the previous commit (which ensures we don't incorrectly hit in the victim cache)) this should never be possible. Drop the check on TLB_INVALID_MASK from the "is this a TLB_RECHECK?" condition, and instead assert that the tlb fill procedure has given us a valid TLB entry (or longjumped out with a guest exception). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180713141636.18665-3-peter.maydell@linaro.org	2018-07-16 17:26:01 +01:00
Peter Maydell	b493ccf1fc	accel/tcg: Use correct test when looking in victim TLB for code In get_page_addr_code(), we were incorrectly looking in the victim TLB for an entry which matched the target address for reads, not for code accesses. This meant that we could hit on a victim TLB entry that indicated that the address was readable but not executable, and incorrectly bypass the call to tlb_fill() which should generate the guest MMU exception. Fix this bug. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180713141636.18665-2-peter.maydell@linaro.org	2018-07-16 17:26:01 +01:00
Richard Henderson	68fea03855	accel/tcg: Avoid caching overwritten tlb entries When installing a TLB entry, remove any cached version of the same page in the VTLB. If the existing TLB entry matches, do not copy into the VTLB, but overwrite it. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-02 08:05:16 -07:00
Peter Maydell	4b1a3e1e34	accel/tcg: Don't treat invalid TLB entries as needing recheck In get_page_addr_code() when we check whether the TLB entry is marked as TLB_RECHECK, we should not go down that code path if the TLB entry is not valid at all (ie the TLB_INVALID bit is set). Tested-by: Laurent Vivier <laurent@vivier.eu> Reported-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20180629161731.16239-1-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-02 08:02:20 -07:00
Peter Maydell	e4c967a720	accel/tcg: Correct "is this a TLB miss" check in get_page_addr_code() In commit `71b9a45330` we changed the condition we use to determine whether we need to refill the TLB in get_page_addr_code() to if (unlikely(env->tlb_table[mmu_idx][index].addr_code != (addr & (TARGET_PAGE_MASK \| TLB_INVALID_MASK)))) { This isn't the right check (it will falsely fail if the input addr happens to have the low bit corresponding to TLB_INVALID_MASK set, for instance). Replace it with a use of the new tlb_hit() function, which is the correct test. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20180629162122.19376-3-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-02 08:02:20 -07:00
Peter Maydell	334692bce7	tcg: Define and use new tlb_hit() and tlb_hit_page() functions The condition to check whether an address has hit against a particular TLB entry is not completely trivial. We do this in various places, and in fact in one place (get_page_addr_code()) we have got the condition wrong. Abstract it out into new tlb_hit() and tlb_hit_page() inline functions (one for a known-page-aligned address and one for an arbitrary address), and use them in all the places where we had the condition correct. This is a no-behaviour-change patch; we leave fixing the buggy code in get_page_addr_code() to a subsequent patch. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-Id: <20180629162122.19376-2-peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-07-02 08:02:20 -07:00
Peter Maydell	55df6fcf54	tcg: Support MMU protection regions smaller than TARGET_PAGE_SIZE Add support for MMU protection regions that are smaller than TARGET_PAGE_SIZE. We do this by marking the TLB entry for those pages with a flag TLB_RECHECK. This flag causes us to always take the slow-path for accesses. In the slow path we can then special case them to always call tlb_fill() again, so we have the correct information for the exact address being accessed. This change allows us to handle reading and writing from small regions; we cannot deal with execution from the small region. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180620130619.11362-2-peter.maydell@linaro.org	2018-06-26 17:50:41 +01:00
Emilio G. Cota	b7542f7fe8	cputlb: remove tb_lock from tlb_flush functions The acquisition of tb_lock was added when the async tlb_flush was introduced in `e3b9ca810` ("cputlb: introduce tlb_flush_* async work.") tb_lock was there to allow us to do memset() on the tb_jmp_cache's. However, since `f3ced3c592` ("tcg: consistently access cpu->tb_jmp_cache atomically") all accesses to tb_jmp_cache are atomic, so tb_lock is not needed here. Get rid of it. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-06-15 08:18:48 -10:00
Peter Maydell	1f871c5e6b	exec.c: Handle IOMMUs in address_space_translate_for_iotlb() Currently we don't support board configurations that put an IOMMU in the path of the CPU's memory transactions, and instead just assert() if the memory region fonud in address_space_translate_for_iotlb() is an IOMMUMemoryRegion. Remove this limitation by having the function handle IOMMUs. This is mostly straightforward, but we must make sure we have a notifier registered for every IOMMU that a transaction has passed through, so that we can flush the TLB appropriately when any of the IOMMUs change their mappings. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20180604152941.20374-5-peter.maydell@linaro.org	2018-06-15 15:23:34 +01:00
Peter Maydell	2d54f19401	cputlb: Pass cpu_transaction_failed() the correct physaddr The API for cpu_transaction_failed() says that it takes the physical address for the failed transaction. However we were actually passing it the offset within the target MemoryRegion. We don't currently have any target CPU implementations of this hook that require the physical address; fix this bug so we don't get confused if we ever do add one. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180611125633.32755-3-peter.maydell@linaro.org	2018-06-15 15:23:34 +01:00
Peter Maydell	ace4109011	cpu-defs.h: Document CPUIOTLBEntry 'addr' field The 'addr' field in the CPUIOTLBEntry struct has a rather non-obvious use; add a comment documenting it (reverse-engineered from what the code that sets it is doing). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20180611125633.32755-2-peter.maydell@linaro.org	2018-06-15 15:23:34 +01:00
Laurent Vivier	98670d47cd	accel/tcg: add size paremeter in tlb_fill() The MC68040 MMU provides the size of the access that triggers the page fault. This size is set in the Special Status Word which is written in the stack frame of the access fault exception. So we need the size in m68k_cpu_unassigned_access() and m68k_cpu_handle_mmu_fault(). To be able to do that, this patch modifies the prototype of handle_mmu_fault handler, tlb_fill() and probe_write(). do_unassigned_access() already includes a size parameter. This patch also updates handle_mmu_fault handlers and tlb_fill() of all targets (only parameter, no code change). Signed-off-by: Laurent Vivier <laurent@vivier.eu> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20180118193846.24953-2-laurent@vivier.eu>	2018-01-25 16:02:24 +01:00
Peter Maydell	34d49937e4	accel/tcg: Handle atomic accesses to notdirty memory correctly To do a write to memory that is marked as notdirty, we need to invalidate any TBs we have cached for that memory, and update the cpu physical memory dirty flags for VGA and migration. The slowpath code in notdirty_mem_write() does all this correctly, but the new atomic handling code in atomic_mmu_lookup() doesn't do anything at all, it just clears the dirty bit in the TLB. The effect of this bug is that if the first write to a notdirty page for which we have cached TBs is by a guest atomic access, we fail to invalidate the TBs and subsequently will execute incorrect code. This can be seen by trying to run 'javac' on AArch64. Use the new notdirty_call_before() and notdirty_call_after() functions to correctly handle the update to notdirty memory in the atomic codepath. Cc: qemu-stable@nongnu.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 1511201308-23580-3-git-send-email-peter.maydell@linaro.org	2017-11-21 12:09:25 +00:00
Richard Henderson	ec603b5584	tcg: Record code_gen_buffer address for user-only memory helpers When we handle a signal from a fault within a user-only memory helper, we cannot cpu_restore_state with the PC found within the signal frame. Use a TLS variable, helper_retaddr, to record the unwind start point to find the faulting guest insn. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2017-11-15 10:33:27 +01:00

1 2

60 Commits