qemu-e2k

Author	SHA1	Message	Date
Dr. David Alan Gilbert	89cb0c0403	typo: apci->acpi apci_1_compatible should be acpi_1_compatible. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20190125094047.22276-1-dgilbert@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 10:16:58 +01:00
kumar sourav	aae049073c	hw: edu: set category of the edu device Sets the category of edu device as DEVICE_CATEGORY_MISC. Devices should be assigned to one of DEVICE_CATEGORY_XXXX. Signed-off-by: kumar sourav <sourav.jb1988@gmail.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20190124144606.4352-1-sourav.jb1988@gmail.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2019-01-30 10:16:00 +01:00
Peter Maydell	b4fbe1f65a	target-arm queue: * Fix validation of 32-bit address spaces for aa32 (fixes an assert introduced in `ba97be9f4a`) * v8m: Ensure IDAU is respected if SAU is disabled * gdbstub: fix gdb_get_cpu(s, pid, tid) when pid and/or tid are 0 * exec.c: Use correct attrs in cpu_memory_rw_debug() * accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write * target/arm: Don't clear supported PMU events when initializing PMCEID1 * memory: add memory_region_flush_rom_device() * microbit: Add stub NRF51 TWI magnetometer/accelerometer detection * tests/microbit-test: extend testing of microbit devices * checkpatch: Don't emit spurious warnings about block comments * aspeed/smc: misc bug fixes * xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs * xlnx-zynqmp: Realize cluster after putting RPUs in it * accel/tcg: Add cluster number to TCG TB hash so differently configured CPUs don't pick up cached TBs for the wrong kind of CPU -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAlxQQA4ZHHBldGVyLm1h eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3rFzD/4mOedFLCEubFGFgpwEQf3P g/oEIOPm6xNFfk/cZPUwBqoANBIxb5sTZCJuNyEWEpziNLtxc+BRzq21UwazsaGV fxBJr3XHZxeX/XeQD8SioPeMeW9SGoOmj4Wigu3dMcQ8wqeOUb0/lGVdqCOJoPmV h7CCW0OufmX9PpLYR/IgFFfCeIGBxCqE+Kld0ZUlXs70ax8JxJK9JsbamJhEpDtC A888r2xbVlj9Pa9eu8MBKqaLF3KS8qWrgrBssPsX3I28p3MykjQAHgnVlNP1X0uo h03+9DiCbCPq/+hr763VaIrbugYeFL1zTo7B7C2cCgwPso2zsiExYR96/0hkZ/6b Yov759LeidcmxoelrJU5AUBgLonWAV4S0lbTN54pxPJcs4l9cvF65U8mgEs7ONAs mbCrLT5plzJrEK/Qj2pVJLY8OggBBg2/jNTW/CRda8Xdh9fNgKeTK2vLDsFygexk ARSOZ4yEzCP6HlSJ7z3s3IEEg1VUv41SY5mWhkhJU5s0WOe/UD5b3KLj7PMkD9bu +ITHKQ+/2OO46klhNWQ3Ipcjz/jHruAFCQQFtHeiK9pFQGgcAbEHzZrmVVoXmk0N cyNk9+e/hiRm/x6IWzJidV8x3iEEtRoNiukwd7jEgkUg99SiGevPrLYD6GVYnZsd R9PYmo6/X2Cjr/SFcOY9Fg== =SzmU -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20190129' into staging target-arm queue: * Fix validation of 32-bit address spaces for aa32 (fixes an assert introduced in `ba97be9f4a`) * v8m: Ensure IDAU is respected if SAU is disabled * gdbstub: fix gdb_get_cpu(s, pid, tid) when pid and/or tid are 0 * exec.c: Use correct attrs in cpu_memory_rw_debug() * accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write * target/arm: Don't clear supported PMU events when initializing PMCEID1 * memory: add memory_region_flush_rom_device() * microbit: Add stub NRF51 TWI magnetometer/accelerometer detection * tests/microbit-test: extend testing of microbit devices * checkpatch: Don't emit spurious warnings about block comments * aspeed/smc: misc bug fixes * xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs * xlnx-zynqmp: Realize cluster after putting RPUs in it * accel/tcg: Add cluster number to TCG TB hash so differently configured CPUs don't pick up cached TBs for the wrong kind of CPU # gpg: Signature made Tue 29 Jan 2019 11:59:10 GMT # gpg: using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE # gpg: issuer "peter.maydell@linaro.org" # gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@gmail.com>" [ultimate] # gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [ultimate] # Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE * remotes/pmaydell/tags/pull-target-arm-20190129: (23 commits) gdbstub: Simplify gdb_get_cpu_pid() to use cpu->cluster_index accel/tcg: Add cluster number to TCG TB hash qom/cpu: Add cluster_index to CPUState hw/arm/xlnx-zynqmp: Realize cluster after putting RPUs in it aspeed/smc: snoop SPI transfers to fake dummy cycles aspeed/smc: Add dummy data register aspeed/smc: define registers for all possible CS aspeed/smc: fix default read value xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs checkpatch: Don't emit spurious warnings about block comments tests/microbit-test: Check nRF51 UART functionality tests/microbit-test: Make test independent of global_qtest tests/libqtest: Introduce qtest_init_with_serial() memory: add memory_region_flush_rom_device() target/arm: Don't clear supported PMU events when initializing PMCEID1 MAINTAINERS: update microbit ARM board files accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write exec.c: Use correct attrs in cpu_memory_rw_debug() tests/microbit-test: add TWI stub device test arm: Stub out NRF51 TWI magnetometer/accelerometer detection ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 12:00:19 +00:00
Peter Maydell	46f5abc0a2	gdbstub: Simplify gdb_get_cpu_pid() to use cpu->cluster_index Now we're keeping the cluster index in the CPUState, we don't need to jump through hoops in gdb_get_cpu_pid() to find the associated cluster object. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Luc Michel <luc.michel@greensocs.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-5-peter.maydell@linaro.org	2019-01-29 11:46:06 +00:00
Peter Maydell	f7b78602fd	accel/tcg: Add cluster number to TCG TB hash Include the cluster number in the hash we use to look up TBs. This is important because a TB that is valid for one cluster at a given physical address and set of CPU flags is not necessarily valid for another: the two clusters may have different views of physical memory, or may have different CPU features (eg FPU present or absent). We put the cluster number in the high 8 bits of the TB cflags. This gives us up to 256 clusters, which should be enough for anybody. If we ever need more, or need more bits in cflags for other purposes, we could make tb_hash_func() take more data (and expand qemu_xxhash7() to qemu_xxhash8()). Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-4-peter.maydell@linaro.org	2019-01-29 11:46:06 +00:00
Peter Maydell	7ea7b9ad53	qom/cpu: Add cluster_index to CPUState For TCG we want to distinguish which cluster a CPU is in, and we need to do it quickly. Cache the cluster index in the CPUState struct, by having the cluster object set cpu->cluster_index for each CPU child when it is realized. This means that board/SoC code must add all CPUs to the cluster before realizing the cluster object. Regrettably QOM provides no way to prevent adding children to a realized object and no way for the parent to be notified when a new child is added to it, so we don't have any way to enforce/assert this constraint; all we can do is document it in a comment. We can at least put in a check that the cluster contains at least one CPU, which should catch the typical cases of "realized cluster too early" or "forgot to parent the CPUs into it". The restriction on how many clusters can exist in the system is imposed by TCG code which will be added in a subsequent commit, but the check to enforce it in cluster.c fits better in this one. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Message-id: 20190121152218.9592-3-peter.maydell@linaro.org	2019-01-29 11:46:05 +00:00
Peter Maydell	fa43442465	hw/arm/xlnx-zynqmp: Realize cluster after putting RPUs in it Currently the cluster implementation doesn't have any constraints on the ordering of realizing the TYPE_CPU_CLUSTER and populating it with child objects. We want to impose a constraint that realize must happen only after all the child objects are added, so move the realize of rpu_cluster. (The apu_cluster is already realized after child population.) Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Luc Michel <luc.michel@greensocs.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Message-id: 20190121152218.9592-2-peter.maydell@linaro.org	2019-01-29 11:46:05 +00:00
Cédric Le Goater	f95c4bffdc	aspeed/smc: snoop SPI transfers to fake dummy cycles The m25p80 models dummy cycles using byte transfers. This works well when the transfers are initiated by the QEMU model of a SPI controller but when these are initiated by the OS, it breaks emulation. Snoop the SPI transfer to catch commands requiring dummy cycles and replace them with byte transfers compatible with the m25p80 model. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com> Message-id: 20190124140519.13838-5-clg@kaod.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:05 +00:00
Cédric Le Goater	9149af2a2d	aspeed/smc: Add dummy data register The SMC controllers have a register containing the byte that will be used as dummy output. It can be modified by software. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Reviewed-by: Joel Stanley <joel@jms.id.au> Message-id: 20190124140519.13838-4-clg@kaod.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:05 +00:00
Cédric Le Goater	597d6bb3e8	aspeed/smc: define registers for all possible CS The model should expose one control register per possible CS. When testing the validity of the register number in the read operation, replace 's->num_cs' by 'ctrl->max_slaves' which represents the maximum number of flash devices a controller can handle. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Joel Stanley <joel@jms.id.au> Message-id: 20190124140519.13838-3-clg@kaod.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:05 +00:00
Cédric Le Goater	b617ca9223	aspeed/smc: fix default read value 0xFFFFFFFF should be returned for non implemented registers. Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: Joel Stanley <joel@jms.id.au> Message-id: 20190124140519.13838-2-clg@kaod.org Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:05 +00:00
Peter Maydell	e5b517536c	xlnx-zynqmp: Don't create rpu-cluster if there are no RPUs If we aren't going to create any RPUs, then don't create the rpu-cluster unit. This allows us to add an assertion to the cluster object that it contains at least one CPU, which helps to avoid bugs in creating clusters and putting CPUs in them. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190121184314.14311-1-peter.maydell@linaro.org	2019-01-29 11:46:05 +00:00
Peter Maydell	b94e809d3e	checkpatch: Don't emit spurious warnings about block comments In checkpatch we attempt to check for and warn about block comments which start with /* or / followed by a non-blank. Unfortunately a bug in the regex meant that we would incorrectly warn about comments starting with "/" with no following text: git show 9813dc6ac3954d58ba16b3920556f106f97e1c67\|./scripts/checkpatch.pl - WARNING: Block comments use a leading /* on a separate line #34: FILE: tests/libqtest.h:233: +/** The sequence "/\\?" was intended to match either "/" or "/", but Perl's semantics for '?' allow it to backtrack and try the "matches 0 chars" option if the "matches 1 char" choice leads to a failure of the rest of the regex to match. Switch to "/\\?+" which uses what perlre(1) calls the "possessive" quantifier form: this means that if it matches the "/" string it will not later backtrack to matching just the "/" prefix. The other end of the regex is also wrong: it is attempting to check for "/* or /** followed by something that isn't just whitespace", but [ \t].+[ \t] will match on pure whitespace. This is less significant but means that a line with just a comment-starter followed by trailing whitespace will generate an incorrect warning about block comment style as well as the correct error about trailing whitespace which a different checkpatch test emits. Fixes: `8c06fbdf36` ("scripts/checkpatch.pl: Enforce multiline comment syntax") Reported-by: Thomas Huth <thuth@redhat.com> Reported-by: Eric Blake <eblake@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20190118165050.22270-1-peter.maydell@linaro.org	2019-01-29 11:46:05 +00:00
Julia Suvorova	46a6603ba6	tests/microbit-test: Check nRF51 UART functionality Some functional tests for: Basic reception/transmittion Suspending INTEN* registers Signed-off-by: Julia Suvorova <jusual@mail.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Message-id: 20190123120759.7162-4-jusual@mail.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:04 +00:00
Julia Suvorova	7cf19e7352	tests/microbit-test: Make test independent of global_qtest Using of global_qtest is not required here. Let's replace functions like readl() with the corresponding qtest_* counterparts. Signed-off-by: Julia Suvorova <jusual@mail.ru> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190123120759.7162-3-jusual@mail.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:04 +00:00
Julia Suvorova	6c90a82c93	tests/libqtest: Introduce qtest_init_with_serial() Run qtest with a socket that connects QEMU chardev and test code. Signed-off-by: Julia Suvorova <jusual@mail.ru> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Message-id: 20190123120759.7162-2-jusual@mail.ru Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:04 +00:00
Stefan Hajnoczi	047be4ed24	memory: add memory_region_flush_rom_device() ROM devices go via MemoryRegionOps->write() callbacks for write operations and do not dirty/invalidate that memory. Device emulation must be able to mark memory ranges that have been modified internally (e.g. using memory_region_get_ram_ptr()). Introduce the memory_region_flush_rom_device() API for this purpose. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190123212234.32068-2-stefanha@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: fix block comment style] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:04 +00:00
Aaron Lindsay OS	bf8d09694c	target/arm: Don't clear supported PMU events when initializing PMCEID1 A bug was introduced during a respin of: commit `57a4a11b2b` target/arm: Add array for supported PMU events, generate PMCEID[01]_EL0 This patch introduced two calls to get_pmceid() during CPU initialization - one each for PMCEID0 and PMCEID1. In addition to building the register values, get_pmceid() clears an internal array mapping event numbers to their implementations (supported_event_map) before rebuilding it. This is an optimization since much of the logic is shared. However, since it was called twice, the contents of supported_event_map reflect only the events in PMCEID1 (the second call to get_pmceid()). Fix this bug by moving the initialization of PMCEID0 and PMCEID1 back into a single function call, and name it more appropriately since it is doing more than simply generating the contents of the PMCEID[01] registers. Signed-off-by: Aaron Lindsay <aaron@os.amperecomputing.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190123195814.29253-1-aaron@os.amperecomputing.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:04 +00:00
Stefan Hajnoczi	c8de3f5fd6	MAINTAINERS: update microbit ARM board files New source files were added without corresponding ./MAINTAINERS file entries. Let's get things up to date. Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190123183352.11025-1-stefanha@redhat.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:04 +00:00
Peter Maydell	f454a54f3b	accel/tcg/user-exec: Don't parse aarch64 insns to test for read vs write In cpu_signal_handler() for aarch64 hosts, currently we parse the faulting instruction to see if it is a load or a store. Since the 3.16 kernel (~2014), the kernel has provided us with the syndrome register for a fault, which includes the WnR bit. Use this instead if it is present, only falling back to instruction parsing if not. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190108180014.32386-1-peter.maydell@linaro.org	2019-01-29 11:46:04 +00:00
Peter Maydell	ea7a5330b7	exec.c: Use correct attrs in cpu_memory_rw_debug() In the softmmu version of cpu_memory_rw_debug(), we ask the CPU for the attributes to use for the virtual memory access, and we correctly use those to identify the address space index. However, we were not passing them in to the address_space_write_rom() and address_space_rw() functions. The effect of this was that a memory access from the gdbstub to a device which had behaviour that was sensitive to the memory attributes (such as some ARMv8M NVIC registers) was incorrectly always performed as if non-secure, rather than using the right security state for the CPU's current state. Fixes: https://bugs.launchpad.net/qemu/+bug/1812091 Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20190117133834.7480-1-peter.maydell@linaro.org	2019-01-29 11:46:04 +00:00
Stefan Hajnoczi	b36356f699	tests/microbit-test: add TWI stub device test This test verifies that we read back the expected I2C WHO_AM_I register values for the accelerometer/magnetometer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190110094020.18354-3-stefanha@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:03 +00:00
Steffen Görtz	9d68bf564e	arm: Stub out NRF51 TWI magnetometer/accelerometer detection Recent microbit firmwares panic if the TWI magnetometer/accelerometer devices are not detected during startup. We don't implement TWI (I2C) so let's stub out these devices just to let the firmware boot. Signed-off by: Steffen Görtz <contrib@steffen-goertz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-id: 20190110094020.18354-2-stefanha@redhat.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: fixed comment style] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:03 +00:00
Luc Michel	ab65eed3f8	gdbstub: fix gdb_get_cpu(s, pid, tid) when pid and/or tid are 0 a TID or PID value means "any thread" (resp. "any process"). This commit fixes the different combinations when at least one value is 0. When both are 0, the function now returns the first attached CPU, instead of the CPU with TID 1, which is not necessarily attached or even existent. When PID is specified but TID is 0, the function returns the first CPU in the process, or NULL if the process does not exist or is not attached. In other cases, it returns the corresponding CPU, while ignoring the PID check when PID is 0. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Luc Michel <luc.michel@greensocs.com> Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20190119140000.11767-1-luc.michel@greensocs.com Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:03 +00:00
Thomas Roth	7e3f122367	target/arm: v8m: Ensure IDAU is respected if SAU is disabled The current behavior of v8m_security_lookup in helper.c only checks whether the IDAU specifies a higher security if the SAU is enabled. If SAU.ALLNS is set to 1, this will lead to addresses being treated as non-secure, even though the IDAU indicates that they must be secure. This patch changes the behavior to also check the IDAU if the SAU is currently disabled. (This brings the behaviour here into line with the v8M Arm ARM SecurityCheck() pseudocode.) Signed-off-by: Thomas Roth <code@stacksmashing.net> Message-id: CAGGekkuc+-tvp5RJP7CM+Jy_hJF7eiRHZ96132sb=hPPCappKg@mail.gmail.com Reviewed-by: Peter Maydell <peter.maydell@linaro.org> [PMM: added pseudocode ref to the commit message, fixed comment style] Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:03 +00:00
Richard Henderson	36d820af0e	target/arm: Fix validation of 32-bit address spaces for aa32 When tsz == 0, aarch32 selects the address space via exclusion, and there are no "top_bits" remaining that require validation. Fixes: `ba97be9f4a` Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Message-id: 20190125184913.5970-1-richard.henderson@linaro.org Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-29 11:46:03 +00:00
Peter Maydell	3a183e330d	Backend vector enhancements Dynamic tlb resizing -----BEGIN PGP SIGNATURE----- iQEcBAABAgAGBQJcTyZfAAoJEGTfOOivfiFf/XIH/2uG8YTamq97ZMALuzSMUD1O RApi8FRghk4M1SMrZv9KAnR3IcYl8Y8Qjlj7LDytD4axVG+1PdPsOwIiVThd3a0d yYB510vCr1nBi0d7an70Ks2n5v0pCm/Q5/WK00F03Swg/eeUGVjiyVhXUQDAdJ8M wI8Qi2eIF7Y2Pin+kXvEvHQwkrCYoRV5V8c+gW7DfuPM9rfjZ2ieAWisUeRkWJuT QVwVEjbAts1RH1JLe7M4DZYaaHoHjjhssG4WUWVt5CVtZBnb10raoRZYR69bNT+w f3LTvpY2Ga0K+rQJa90hWig5dbpgUQ2nOBCU0B6/Ee/SRxo74HQEIzhKM8TjMYw= =NT5a -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20190128' into staging Backend vector enhancements Dynamic tlb resizing # gpg: Signature made Mon 28 Jan 2019 15:57:19 GMT # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full] # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20190128: (23 commits) cputlb: Remove static tlb sizing tcg/tci: enable dynamic TLB sizing tcg/mips: enable dynamic TLB sizing tcg/mips: Fix tcg_out_qemu_ld_slow_path tcg/arm: enable dynamic TLB sizing tcg/riscv: enable dynamic TLB sizing tcg/s390: enable dynamic TLB sizing tcg/sparc: enable dynamic TLB sizing tcg/ppc: enable dynamic TLB sizing tcg/aarch64: enable dynamic TLB sizing tcg/i386: enable dynamic TLB sizing tcg: introduce dynamic TLB sizing cputlb: do not evict empty entries to the vtlb tcg/aarch64: Implement vector minmax arithmetic tcg/aarch64: Implement vector saturating arithmetic tcg/i386: Implement vector minmax arithmetic tcg/i386: Implement vector saturating arithmetic tcg/i386: Split subroutines out of tcg_expand_vec_op tcg: Add opcodes for vector minmax arithmetic tcg: Add opcodes for vector saturated arithmetic ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-01-28 16:26:47 +00:00
Richard Henderson	e77c89fb08	cputlb: Remove static tlb sizing Now that all tcg backends support TCG_TARGET_IMPLEMENTS_DYN_TLB, remove the define and the old code. Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:35 -08:00
Richard Henderson	0a9a83d6bf	tcg/tci: enable dynamic TLB sizing This is automatic due to TCI using the other softtlb macros. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	ac33373e0e	tcg/mips: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	a31aa4ce00	tcg/mips: Fix tcg_out_qemu_ld_slow_path Patch the branch after it has been emitted rather than before it exists. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	cd7d3cb7a2	tcg/arm: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	41b70f220b	tcg/riscv: enable dynamic TLB sizing Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:24 -08:00
Richard Henderson	4f47e338f6	tcg/s390: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	17ff9f7801	tcg/sparc: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	644f591ab0	tcg/ppc: enable dynamic TLB sizing Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Richard Henderson	f7bcd96669	tcg/aarch64: enable dynamic TLB sizing Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:04:10 -08:00
Emilio G. Cota	54eaf40b8f	tcg/i386: enable dynamic TLB sizing As the following experiments show, this series is a net perf gain, particularly for memory-heavy workloads. Experiments are run on an Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz. 1. System boot + shudown, debian aarch64: - Before (v3.1.0): Performance counter stats for './die.sh v3.1.0' (10 runs): 9019.797015 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 29,910,312,379 cycles # 3.316 GHz ( +- 0.14% ) 54,699,252,014 instructions # 1.83 insn per cycle ( +- 0.08% ) 10,061,951,686 branches # 1115.541 M/sec ( +- 0.08% ) 172,966,530 branch-misses # 1.72% of all branches ( +- 0.07% ) 9.084039051 seconds time elapsed ( +- 0.23% ) - After: Performance counter stats for './die.sh tlb-dyn-v5' (10 runs): 8624.084842 task-clock (msec) # 0.993 CPUs utilized ( +- 0.23% ) 28,556,123,404 cycles # 3.311 GHz ( +- 0.13% ) 51,755,089,512 instructions # 1.81 insn per cycle ( +- 0.05% ) 9,526,513,946 branches # 1104.641 M/sec ( +- 0.05% ) 166,578,509 branch-misses # 1.75% of all branches ( +- 0.19% ) 8.680540350 seconds time elapsed ( +- 0.24% ) That is, a 4.4% perf increase. 2. System boot + shutdown, ubuntu 18.04 x86_64: - Before (v3.1.0): 56100.574751 task-clock (msec) # 1.016 CPUs utilized ( +- 4.81% ) 200,745,466,128 cycles # 3.578 GHz ( +- 5.24% ) 431,949,100,608 instructions # 2.15 insn per cycle ( +- 5.65% ) 77,502,383,330 branches # 1381.490 M/sec ( +- 6.18% ) 844,681,191 branch-misses # 1.09% of all branches ( +- 3.82% ) 55.221556378 seconds time elapsed ( +- 5.01% ) - After: 56603.419540 task-clock (msec) # 1.019 CPUs utilized ( +- 10.19% ) 202,217,930,479 cycles # 3.573 GHz ( +- 10.69% ) 439,336,291,626 instructions # 2.17 insn per cycle ( +- 14.14% ) 80,538,357,447 branches # 1422.853 M/sec ( +- 16.09% ) 776,321,622 branch-misses # 0.96% of all branches ( +- 3.77% ) 55.549661409 seconds time elapsed ( +- 10.44% ) No improvement (within noise range). Note that for this workload, increasing the time window too much can lead to perf degradation, since it flushes the TLB very frequently. 3. x86_64 SPEC06int: x86_64-softmmu speedup vs. v3.1.0 for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 5.5 +------------------------------------------------------------------------+ \| +-+ \| 5 \|-+.................+-+...............................tlb-dyn-v5.......+-\| \| * * \| 4.5 \|-+..................................................................+-\| \| * * \| 4 \|-+..................................................................+-\| \| * * \| 3.5 \|-+..................................................................+-\| \| * * \| 3 \|-+......+-+........................................................+-\| \| * * * \| 2.5 \|-+.................................................+-+...........+-\| \| * * * * * \| 2 \|-+..............................................................+-\| \| * * * * * * +-+ \| 1.5 \|-+....................................................+-+.+-+.+-\| \| * * +-+ * +-+ +-+ +-+ +-+ * * * * * \| 1 \|++++-++++++++++++++++-+++-+++-++++-++++-++++++++++++++\| \| * * * * * * * * * * * * * * * * * * * * * * * * * * \| 0.5 +------------------------------------------------------------------------+ 400.perlb401.bzip403.g429445.g456.hm462.libq464.h471.omn47483.xalancbgeomean png: https://imgur.com/YRF90f7 That is, a 1.51x average speedup over the baseline, with a max speedup of 5.17x. Here's a different look at the SPEC06int results, using KVM as the baseline: x86_64-softmmu slowdown vs. KVM for SPEC06int (test set) Host: Intel(R) Xeon(R) Gold 6142 CPU @ 2.60GHz (Skylake) 25 +---------------------------------------------------------------------------+ \| +-+ +-+ \| \| * * +-+ v3.1.0 \| \| * * +-+ tlb-dyn-v5 \| \| * * * * +-+ \| 20 \|-+................................................+-+...............+-\| \| * * # # * * \| \| +-+ * * * # # * * \| \| * * * * * # # * * \| 15 \|-+..............................................#.#.......+-+......+-\| \| * * * * * # # * #\|# \| \| * * * * +-+ * # # * +-+ \| \| * * +-+ * * ++-+ +-+ * # # * # # +-+ \| \| * * +-+ * * * ## \| +-+ # # * # # +-+ \| 10 \|-+..........+-+...........##.......++-+..+-+.#.#.......#.#....+-\| \| * * +-+ * * * ## +-+ # # # #* # # +-+ * # # * * \| \| * * * # # * * +-+ * ## * +-+ # # # #* # # * * * # # +-+ \| \| * * # # * * * +-+ * ## * # # # # # #* # # * * * # # * ## \| 5 \|-+.......+-+.#.#.....#.#..##..#.#.#.#..#.#.#.#.....#.#..##.+-\| \| * # #* # # * +-+* # # * ## * # # # # # #* # # * * * # # * ## \| \| * # #* # # * # #* # # * ## * # # # # # #* # # * +-+* # # * ## \| \| ++-+ * # #* # # * # #* # # * ## * # # # # # #* # # * # #* # # * ## \| \|+++#+#++#+#+#+#++#+#+#+#++##++#+#+#+#++#+#+#+#++#+#+#+#+*+##+++\| 0 +---------------------------------------------------------------------------+ 400.perlbe401.bzi403.gc429445.go456.h462.libqu464.h471.omne4483.xalancbmgeomean png: https://imgur.com/YzAMNEV After this series, we bring down the average SPEC06int slowdown vs KVM from 11.47x to 7.58x. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-4-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Emilio G. Cota	86e1eff8bc	tcg: introduce dynamic TLB sizing Disabled in all TCG backends for now. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-3-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Emilio G. Cota	3cea94bbc9	cputlb: do not evict empty entries to the vtlb Currently we evict an entry to the victim TLB when it doesn't match the current address. But it could be that there's no match because the current entry is empty (i.e. all -1's, for instance via tlb_flush). Do not evict the entry to the vtlb in that case. This change will help us keep track of the TLB's use rate, which we'll use to implement a policy for dynamic TLB sizing. Tested-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20190116170114.26802-2-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	93f332a503	tcg/aarch64: Implement vector minmax arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	d32648d445	tcg/aarch64: Implement vector saturating arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	bc37faf4cb	tcg/i386: Implement vector minmax arithmetic The avx instruction set does not directly provide MO_64. We can still implement 64-bit with comparison and vpblendvb. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	8ffafbcec2	tcg/i386: Implement vector saturating arithmetic Only MO_8 and MO_16 are implemented, since that's all the instruction set provides. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	44f1441dbe	tcg/i386: Split subroutines out of tcg_expand_vec_op This routine was becoming too large. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	dd0a0fcdd8	tcg: Add opcodes for vector minmax arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	8afaf05066	tcg: Add opcodes for vector saturated arithmetic Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	5d6acdd4a4	tcg: Add write_aofs to GVecGen4 This allows writing 2 output, 3 input operations. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	f550805d83	tcg: Add gvec expanders for nand, nor, eqv Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00
Richard Henderson	9a9eda78e4	tcg: Add logical simplifications during gvec expand We handle many of these during integer expansion, and the rest of them during integer optimization. Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2019-01-28 07:03:34 -08:00

1 2 3 4 5 ...

66486 Commits