qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Dominik Csapak	ecd7a0d5bb	qmp: Add reason to SHUTDOWN and RESET events This makes it possible to determine what the exact reason was for a RESET or a SHUTDOWN. A management layer might need the specific reason of those events to determine which cleanups or other actions it needs to do. This patch also updates the iotests to the new expected output that includes the reason. Signed-off-by: Dominik Csapak <d.csapak@proxmox.com> Message-Id: <20181205110131.23049-3-d.csapak@proxmox.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [Commit message tweaked] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-12-18 07:55:47 +01:00
Dominik Csapak	d43013e24d	qapi: Turn ShutdownCause into QAPI enum Needed so the patch after next can add ShutdownCause to QMP events SHUTDOWN and RESET. Signed-off-by: Dominik Csapak <d.csapak@proxmox.com> Message-Id: <20181205110131.23049-2-d.csapak@proxmox.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2018-12-18 07:55:47 +01:00
Peter Maydell	ec3c927f3d	Hardfloat + maintainers and gitdm Merge remote-tracking branch 'remotes/stsquad/tags/pull-hardfloat-and-gitdm-171218-3' into staging # gpg: Signature made Mon 17 Dec 2018 10:55:19 GMT # gpg: using RSA key FBD0DB095A9E2A44 # gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>" # Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44 * remotes/stsquad/tags/pull-hardfloat-and-gitdm-171218-3: hardfloat: implement float32/64 comparison hardfloat: implement float32/64 square root hardfloat: implement float32/64 fused multiply-add hardfloat: implement float32/64 division hardfloat: implement float32/64 multiplication hardfloat: implement float32/64 addition and subtraction fpu: introduce hardfloat tests/fp: add fp-bench softfloat: add float{32,64}_is_zero_or_normal softfloat: rename canonicalize to sf_canonicalize target/tricore: use float32_is_denormal softfloat: add float{32,64}_is_{de,}normal fp-test: pick TARGET_ARM to get its specialization MAINTAINERS: update status of FPU emulation contrib: add a basic gitdm config Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-17 23:46:05 +00:00
Philippe Mathieu-Daudé	fe17cca6bd	tests/bios-tables-test: Sanitize test verbose output Fix the extraneous extra blank lines in the test output when running with V=1. Before: TEST: tests/bios-tables-test... (pid=25678) /i386/acpi/piix4: Looking for expected file 'tests/acpi-test-data/pc/DSDT' Using expected file 'tests/acpi-test-data/pc/DSDT' Looking for expected file 'tests/acpi-test-data/pc/FACP' Using expected file 'tests/acpi-test-data/pc/FACP' Looking for expected file 'tests/acpi-test-data/pc/APIC' Using expected file 'tests/acpi-test-data/pc/APIC' Looking for expected file 'tests/acpi-test-data/pc/HPET' Using expected file 'tests/acpi-test-data/pc/HPET' OK After: TEST: tests/bios-tables-test... (pid=667) /i386/acpi/piix4: Looking for expected file 'tests/acpi-test-data/pc/DSDT' Using expected file 'tests/acpi-test-data/pc/DSDT' Looking for expected file 'tests/acpi-test-data/pc/FACP' Using expected file 'tests/acpi-test-data/pc/FACP' Looking for expected file 'tests/acpi-test-data/pc/APIC' Using expected file 'tests/acpi-test-data/pc/APIC' Looking for expected file 'tests/acpi-test-data/pc/HPET' Using expected file 'tests/acpi-test-data/pc/HPET' OK Suggested-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 16:21:51 +01:00
Igor Mammedov	da15af6497	tests: acpi: remove not used ACPI_READ_GENERIC_ADDRESS macro Signed-off-by: Igor Mammedov <imammedo@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> [thuth: Fixed conflicts with additional "qts" parameter] Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:37:55 +01:00
Richard Henderson	21f80286cc	tests: Exit boot-serial-test loop if child dies There's no point in waiting 5 full minutes when there will be no more output. Compute timeout based on elapsed wall clock time instead of N * delays, as the delay is a minimum sleep time. Cc: Thomas Huth <thuth@redhat.com> Cc: Laurent Vivier <lvivier@redhat.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Wainer dos Santos Moschetta <wainersm@redhat.com> [thuth: Replaced global_qtest with local qts variable] Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:37:50 +01:00
Thomas Huth	43497c438d	tests/pxe: Make test independent of global_qtest global_qtest is not really required here, since boot_sector_test() is already independent from that global variable. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:40 +01:00
Thomas Huth	dc4c158722	tests/prom-env: Make test independent of global_qtest global_qtest is only needed here for one readl(). Let's replace it with qtest_readl() and we can remove the global_qtest variable here. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:40 +01:00
Thomas Huth	ed398a1206	tests/machine-none: Make test independent of global_qtest Apart from using qmp() in one spot, this test does not have any dependencies to the global_qtest variable, so we can simply get rid of it here by replacing the qmp() with qtest_qmp(). Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:40 +01:00
Thomas Huth	a2569b001c	tests/test-filter: Make tests independent of global_qtest Apart from using qmp() in the qmp_discard_response() macro, these tests do not have any dependencies to the global_qtest variable, so we can simply get rid of it here by replacing the qmp() with qtest_qmp() in the macro. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:40 +01:00
Thomas Huth	e6426b7419	tests/boot-serial: Get rid of global_qtest variable The test does not use any of the functions that require global_qtest, so we can simply get rid of this global variable here. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:40 +01:00
Thomas Huth	791a289bad	tests/pvpanic: Make the pvpanic test independent of global_qtest We want to get rid of global_qtest in the long run, thus do not use the wrappers like inb() and outb() here anymore. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:40 +01:00
Thomas Huth	ac16ab753a	tests/vmgenid: Make test independent of global_qtest The biggest part has already been done in the previous patch, we now only have to replace some few qmp() and readb() calls with the corresponding qtest_*() functions to get there. Acked-by: Igor Mammedov <imammedo@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:39 +01:00
Eric Blake	273e3d92cf	tests/acpi-utils: Drop dependence on global_qtest As a general rule, we prefer avoiding implicit global state because it makes code harder to safely copy and paste without thinking about the global state. Adjust the helper code to use explicit state instead, and update all callers. bios-tables-test no longer depends on global_qtest, now that it passes explicit state through the testsuite data; an assert proves this fact (although we will get rid of it later, once global_qtest is gone). Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Tested-by: Igor Mammedov <imammedo@redhat.com> [thuth: adapted patch to current master branch] Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:39 +01:00
Eric Blake	24c01ffa9d	ivshmem-test: Drop dependence on global_qtest Managing parallel connections to two different monitors via the implicit global_qtest makes it hard to copy-and-paste code to tests that are not aware of the implicit state. Since we have already fixed qpci to avoid global_qtest, we can now simplify by not using global_qtest anywhere in ivshmem-test. We can assert that the conversion is correct by checking that global_qtest remains NULL throughout the test (a later patch that changes global_qtest to not be a public global variable will drop the assertions). Signed-off-by: Eric Blake <eblake@redhat.com> [thuth: Dropped the changes to test_ivshmem_hotplug() - will be fixed later] Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:39 +01:00
Thomas Huth	d786f78252	tests/libqos/pci: Make PCI access functions independent of global_qtest QPCIBus already tracks QTestState, so use that state instead of an implicit reliance on global_qtest. Based on an earlier patch ("libqos: Use explicit QTestState for pci operations") from Eric Blake. Signed-off-by: Thomas Huth <thuth@redhat.com>	2018-12-17 15:36:39 +01:00
Peter Maydell	f163448536	- Remove retranslation remnants - Return success from patch_reloc - Preserve 32-bit values as zero-extended on x86_64 - Make bswap during memory ops optional - Cleanup xxhash - Revert constant pooling for tcg/sparc/ Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20181216' into staging # gpg: Signature made Mon 17 Dec 2018 03:25:21 GMT # gpg: using RSA key 64DF38E8AF7E215F # gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" # Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F * remotes/rth/tags/pull-tcg-20181216: (33 commits) xxhash: match output against the original xxhash32 include: move exec/tb-hash-xx.h to qemu/xxhash.h exec: introduce qemu_xxhash{2,4,5,6,7} qht-bench: document -p flag tcg: Drop nargs from tcg_op_insert_{before,after} tcg/mips: Improve the add2/sub2 command to use TCG_TARGET_REG_BITS tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP tcg/optimize: Optimize bswap tcg: Clean up generic bswap64 tcg: Clean up generic bswap32 tcg/i386: Add setup_guest_base_seg for FreeBSD tcg/i386: Precompute all guest_base parameters tcg/i386: Assume 32-bit values are zero-extended tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests tcg/i386: Propagate is64 to tcg_out_qemu_ld_slow_path tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct tcg/s390x: Return false on failure from patch_reloc tcg/ppc: Return false on failure from patch_reloc tcg/arm: Return false on failure from patch_reloc tcg/aarch64: Return false on failure from patch_reloc ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-17 13:04:25 +00:00
Alex Bennée	139108f684	.shippable.yml: disable the win cross tests The pkg.mxe.cc package repositories have been down for the last two weeks causing the builds to fail when shippable re-builds the containers. This is really just a sticking plaster until we can get our own docker hub images properly setup so we can avoid having dependencies on external repos. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Acked-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-id: 20181214151718.5041-1-alex.bennee@linaro.org Cc: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-12-17 13:02:12 +00:00
Emilio G. Cota	d9fe9db943	hardfloat: implement float32/64 comparison Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 110.98 MFlops cmp-double: 107.12 MFlops - after: cmp-single: 506.28 MFlops cmp-double: 524.77 MFlops Note that flattening both eq and eq_signaling versions would give us extra performance (695v506, 615v524 Mflops for single/double, respectively) but this would emit two essentially identical functions for each eq/signaling pair, which is a waste. Aggregate performance improvement for the last few patches: [ all charts in png: https://imgur.com/a/4yV8p ] 1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz [ASCII bar chart not preserved: qemu-aarch64 NBench score; higher is better; x-axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean] [ASCII bar chart not preserved: qemu-aarch64 SPEC06fp (test set) speedup over QEMU `4c2c101590`; error bars: 95% confidence interval; x-axis: individual SPEC06fp benchmarks and geomean] 2. Host: ARM Aarch64 A57 @ 2.4GHz (Applied Micro X-Gene) [ASCII bar chart not preserved: qemu-aarch64 NBench score; higher is better; x-axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	f131bae8a7	hardfloat: implement float32/64 square root Performance results for fp-bench: Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: sqrt-single: 42.30 MFlops sqrt-double: 22.97 MFlops - after: sqrt-single: 311.42 MFlops sqrt-double: 311.08 MFlops Here USE_FP makes a huge difference for f64's, with throughput going from ~200 MFlops to ~300 MFlops. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	ccf770ba73	hardfloat: implement float32/64 fused multiply-add Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: fma-single: 74.73 MFlops fma-double: 74.54 MFlops - after: fma-single: 203.37 MFlops fma-double: 169.37 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: fma-single: 23.24 MFlops fma-double: 23.70 MFlops - after: fma-single: 66.14 MFlops fma-double: 63.10 MFlops 3. IBM POWER8E @ 2.1 GHz - before: fma-single: 37.26 MFlops fma-double: 37.29 MFlops - after: fma-single: 48.90 MFlops fma-double: 59.51 MFlops Here having 3FP64 set to 1 pays off for x86_64: [1] 170.15 vs [0] 153.12 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	4a6295613f	hardfloat: implement float32/64 division Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: div-single: 34.84 MFlops div-double: 34.04 MFlops - after: div-single: 275.23 MFlops div-double: 216.38 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: div-single: 9.33 MFlops div-double: 9.30 MFlops - after: div-single: 51.55 MFlops div-double: 15.09 MFlops 3. IBM POWER8E @ 2.1 GHz - before: div-single: 25.65 MFlops div-double: 24.91 MFlops - after: div-single: 96.83 MFlops div-double: 31.01 MFlops Here setting 2FP64_USE_FP to 1 pays off for x86_64: [1] 215.97 vs [0] 62.15 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	2dfabc86e6	hardfloat: implement float32/64 multiplication Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: mul-single: 126.91 MFlops mul-double: 118.28 MFlops - after: mul-single: 258.02 MFlops mul-double: 197.96 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: mul-single: 37.42 MFlops mul-double: 38.77 MFlops - after: mul-single: 73.41 MFlops mul-double: 76.93 MFlops 3. IBM POWER8E @ 2.1 GHz - before: mul-single: 58.40 MFlops mul-double: 59.33 MFlops - after: mul-single: 60.25 MFlops mul-double: 94.79 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	1b615d4820	hardfloat: implement float32/64 addition and subtraction Performance results (single and double precision) for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: add-single: 135.07 MFlops add-double: 131.60 MFlops sub-single: 130.04 MFlops sub-double: 133.01 MFlops - after: add-single: 443.04 MFlops add-double: 301.95 MFlops sub-single: 411.36 MFlops sub-double: 293.15 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: add-single: 44.79 MFlops add-double: 49.20 MFlops sub-single: 44.55 MFlops sub-double: 49.06 MFlops - after: add-single: 93.28 MFlops add-double: 88.27 MFlops sub-single: 91.47 MFlops sub-double: 88.27 MFlops 3. IBM POWER8E @ 2.1 GHz - before: add-single: 72.59 MFlops add-double: 72.27 MFlops sub-single: 75.33 MFlops sub-double: 70.54 MFlops - after: add-single: 112.95 MFlops add-double: 201.11 MFlops sub-single: 116.80 MFlops sub-double: 188.72 MFlops Note that the IBM and ARM machines benefit from having HARDFLOAT_2F{32,64}_USE_FP set to 0. Otherwise their performance can suffer significantly: - IBM Power8: add-single: [1] 54.94 vs [0] 116.37 MFlops add-double: [1] 58.92 vs [0] 201.44 MFlops - Aarch64 A57: add-single: [1] 80.72 vs [0] 93.24 MFlops add-double: [1] 82.10 vs [0] 88.18 MFlops On the Intel machine, having 2F64 set to 1 pays off, but it doesn't for 2F32: - Intel i7-6700K: add-single: [1] 285.79 vs [0] 426.70 MFlops add-double: [1] 302.15 vs [0] 278.82 MFlops Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	a94b783952	fpu: introduce hardfloat The appended paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest). The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags that break IEEE compatibility. There is no way to detect all of these flags at compilation time, but at least we check for -ffast-math (which defines __FAST_MATH__) and disable hardfloat (plus emit a #warning) when it is set. This patch just adds common code. Some operations will be migrated to hardfloat in subsequent patches to ease bisection. Note: some architectures (at least PPC, there might be others) clear the status flags passed to softfloat before most FP operations. This precludes the use of hardfloat, so to avoid introducing a performance regression for those targets, we add a flag to disable hardfloat. In the long run though it would be good to fix the targets so that at least the inexact flag passed to softfloat is indeed sticky. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	25f539f359	tests/fp: add fp-bench These microbenchmarks will allow us to measure the performance impact of FP emulation optimizations. Note that we can measure both directly the impact on the softfloat functions (with "-t soft"), or the impact on an emulated workload (call with "-t host" and run under qemu user-mode). Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	315df0d193	softfloat: add float{32,64}_is_zero_or_normal These will gain some users very soon. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	f9943c7f76	softfloat: rename canonicalize to sf_canonicalize glibc >= 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26). Given that we'll be including <math.h> soon, prepare for this by prefixing our canonicalize() with sf_ to avoid clashing with the libc's canonicalize(). Reported-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	b8c547000d	target/tricore: use float32_is_denormal Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	588e6dfd87	softfloat: add float{32,64}_is_{de,}normal This paves the way for upcoming work. Reviewed-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Emilio G. Cota	6c49b06dfd	fp-test: pick TARGET_ARM to get its specialization This gets rid of the muladd errors due to not raising the invalid flag. - Before: Errors found in f64_mulAdd, rounding near_even, tininess before rounding: +000.0000000000000 +7FF.0000000000000 +7FF.FFFFFFFFFFFFF => +7FF.FFFFFFFFFFFFF ..... expected -7FF.FFFFFFFFFFFFF v.... [...] - After: In 6133248 tests, no errors found in f64_mulAdd, rounding near_even, tininess before rounding. [...] Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>	2018-12-17 08:25:25 +00:00
Alex Bennée	0636e4d899	MAINTAINERS: update status of FPU emulation Given I've spent a fair amount of time around this code now I'm putting myself forward as a maintainer. Also given that the code has been extensively re-written and has testing and new incoming features it is probably more than just Odd Fixes. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 08:25:25 +00:00
Alex Bennée	2f28271d80	contrib: add a basic gitdm config This is a QEMU specific version of a gitdm config for generating reports on the contributor base of the project. I've added enough group maps and domain aliases to ensure the current top ten is as reflective as it can be. As of this commit running: git log --numstat --since "Last Year" | gitdm -n -l 10 Reports: Top changeset contributors by employer Red Hat 3172 (44.3%) Linaro 1153 (16.1%) (None) 549 (7.7%) IBM 348 (4.9%) Academics (various) 170 (2.4%) Virtuozzo 168 (2.3%) Wave Computing 118 (1.6%) Xilinx 102 (1.4%) Igalia 93 (1.3%) Cadence Design Systems 88 (1.2%) Top lines changed by employer Red Hat 144092 (28.1%) Cadence Design Systems 126554 (24.6%) Linaro 77480 (15.1%) Wave Computing 33134 (6.5%) SiFive 14392 (2.8%) IBM 12219 (2.4%) (None) 11948 (2.3%) Academics (various) 10447 (2.0%) Virtuozzo 10445 (2.0%) CodeWeavers 9179 (1.8%) Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Aleksandar Markovic <amarkovic@wavecomp.com>	2018-12-17 08:25:10 +00:00
Emilio G. Cota	b7c2cd08a6	xxhash: match output against the original xxhash32 Change the order in which we extract a/b and c/d to match the output of the upstream xxhash32. Tested with: https://github.com/cota/xxhash/tree/qemu Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Tested-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Emilio G. Cota	fe656e3185	include: move exec/tb-hash-xx.h to qemu/xxhash.h Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Emilio G. Cota	c971d8fa73	exec: introduce qemu_xxhash{2,4,5,6,7} Before moving them all to include/qemu/xxhash.h. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Emilio G. Cota	e132fde25f	qht-bench: document -p flag Which we forgot to do in `bd224fce60` ("qht-bench: add -p flag to precompute hash values", 2018-09-26). Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Emilio G. Cota	ac1043f6d6	tcg: Drop nargs from tcg_op_insert_{before,after} It's unused since `75e8b9b7aa`. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181209193749.12277-9-cota@braap.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Alistair Francis	161dec9d1b	tcg/mips: Improve the add2/sub2 command to use TCG_TARGET_REG_BITS Instead of hard coding 31 for the shift right use TCG_TARGET_REG_BITS - 1. Signed-off-by: Alistair Francis <alistair.francis@wdc.com> Message-Id: <7dfbddf7014a595150aa79011ddb342c3cc17ec3.1544648105.git.alistair.francis@wdc.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	e1dcf3529d	tcg: Add TCG_TARGET_HAS_MEMORY_BSWAP For now, defined universally as true, since we previously required backends to implement swapped memory operations. Future patches may now remove that support where it is onerous. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	6498594c8e	tcg/optimize: Optimize bswap Somehow we forgot these operations, once upon a time. This will allow immediate stores to have their bswap optimized away. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	9e821eab0a	tcg: Clean up generic bswap64 Based on the only current user, Sparc: New code uses 2 constants that take 2 insns to load from constant pool, plus 13. Old code used 6 constants that took 1 or 2 insns to create, plus 21. The result is a new total of 17 vs an old total of 29. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	a686dc71d8	tcg: Clean up generic bswap32 Based on the only current user, Sparc: New code uses 1 constant that takes 2 insns to create, plus 8. Old code used 2 constants that took 2 insns to create, plus 9. The result is a new total of 10 vs an old total of 13. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	5785c17f31	tcg/i386: Add setup_guest_base_seg for FreeBSD Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	913c2bddc2	tcg/i386: Precompute all guest_base parameters These values are constant between all qemu_ld/st invocations; there is no need to figure this out each time. If we cannot use a segment or an offset directly for guest_base, load the value into a register in the prologue. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	4810d96f03	tcg/i386: Assume 32-bit values are zero-extended We now have an invariant that all TCG_TYPE_I32 values are zero-extended, which means that we do not need to extend them again during qemu_ld/st, either explicitly via a separate tcg_out_ext32u or implicitly via P_ADDR32. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	75478279a0	tcg/i386: Implement INDEX_op_extr{lh}_i64_i32 for 32-bit guests This preserves the invariant that all TCG_TYPE_I32 values are zero-extended in the 64-bit host register. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	3dbc8c61de	tcg/i386: Propagate is64 to tcg_out_qemu_ld_slow_path This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:44 +03:00
Richard Henderson	1d21d95b61	tcg/i386: Propagate is64 to tcg_out_qemu_ld_direct This helps preserve the invariant that all TCG_TYPE_I32 values are stored zero-extended in the 64-bit host registers. Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
Richard Henderson	55dfd8fedc	tcg/s390x: Return false on failure from patch_reloc This does require an extra two checks within the slow paths to replace the assert that we're moving. Also add two checks within existing functions that lacked any kind of assert for out of range branch. Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-12-17 06:04:43 +03:00
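
The "fpu: introduce hardfloat" commit (a94b783952) in the list above describes using the host FPU for a subset of guest FP operations without reading the host exception flags. The following C sketch is one reading of that idea, not QEMU's code: the type `fp_status`, the flag constants, and the functions `soft_add` and `hard_add` are hypothetical names, and the exact fast-path conditions (inexact already set, round-to-nearest, zero-or-normal inputs) are assumptions drawn from the commit message.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the guest FP status; not QEMU's real types. */
enum { FLAG_INEXACT = 1u << 0, FLAG_OVERFLOW = 1u << 1 };
enum { ROUND_NEAREST_EVEN = 0, ROUND_TO_ZERO = 1 };

typedef struct {
    int rounding_mode;    /* guest rounding mode */
    unsigned flags;       /* sticky guest exception flags */
    unsigned soft_calls;  /* instrumentation: number of slow-path calls */
} fp_status;

/* Stand-in for the software implementation (the slow path). A real one
 * would compute the result and every exception flag in software. */
static double soft_add(double a, double b, fp_status *s)
{
    s->soft_calls++;
    return a + b;
}

static bool is_zero_or_normal(double x)
{
    int c = fpclassify(x);
    return c == FP_ZERO || c == FP_NORMAL;
}

/* Assumption: the host FPU is usable only when a missed "inexact" cannot be
 * observed (the sticky flag is already set) and rounding is to nearest. */
static bool can_use_host_fpu(const fp_status *s)
{
    return (s->flags & FLAG_INEXACT) && s->rounding_mode == ROUND_NEAREST_EVEN;
}

double hard_add(double a, double b, fp_status *s)
{
    assert(s != NULL);
    if (!can_use_host_fpu(s) || !is_zero_or_normal(a) || !is_zero_or_normal(b)) {
        return soft_add(a, b, s);
    }
    double r = a + b;                    /* the host FPU does the work */
    double mag = r < 0 ? -r : r;
    if (isinf(r)) {
        s->flags |= FLAG_OVERFLOW;       /* inputs were finite, so this is overflow */
    } else if (mag <= DBL_MIN && !(a == 0.0 && b == 0.0)) {
        return soft_add(a, b, s);        /* tiny result: let software handle it */
    }
    return r;
}
```

The fast path never touches the host flags register: overflow is inferred from the result, and anything near the denormal range is handed back to software.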

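The "tests: Exit boot-serial-test loop if child dies" commit (21f80286cc) in the list above computes its timeout from elapsed wall-clock time instead of N * delays, and stops waiting once the child is gone. A minimal C sketch of that pattern follows, assuming a POSIX host; `wait_for_output`, `fake_child`, and the callback names are hypothetical and do not come from the QEMU test suite.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef enum { WAIT_DONE, WAIT_CHILD_DIED, WAIT_TIMEOUT } wait_result;
typedef bool (*poll_fn)(void *opaque);

/* Monotonic clock in milliseconds. */
static int64_t now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Poll until output is ready, the child dies, or the deadline passes.
 * The deadline is an absolute time, so sleeps that overrun (the delay is
 * only a minimum) cannot stretch the total wait. */
wait_result wait_for_output(poll_fn output_ready, poll_fn child_alive,
                            void *opaque, int64_t timeout_ms, long delay_us)
{
    assert(timeout_ms >= 0 && delay_us >= 0 && delay_us < 1000000);
    const int64_t deadline = now_ms() + timeout_ms;
    const struct timespec delay = { 0, delay_us * 1000L };

    for (;;) {
        if (output_ready(opaque)) {
            return WAIT_DONE;
        }
        if (!child_alive(opaque)) {
            return WAIT_CHILD_DIED;   /* no more output will ever arrive */
        }
        if (now_ms() >= deadline) {
            return WAIT_TIMEOUT;
        }
        nanosleep(&delay, NULL);
    }
}

/* Hypothetical fake child for demonstration: output becomes ready after
 * ready_after polls, and the child "dies" after dies_after liveness checks. */
typedef struct {
    int polls, ready_after;
    int checks, dies_after;
} fake_child;

bool fake_ready(void *opaque)
{
    fake_child *c = opaque;
    return ++c->polls > c->ready_after;
}

bool fake_alive(void *opaque)
{
    fake_child *c = opaque;
    return ++c->checks <= c->dies_after;
}
```

A caller would pass its own readiness and liveness callbacks; the fake ones here only make the three outcomes easy to exercise.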
65579 Commits
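
The "tcg: Clean up generic bswap32" commit (a686dc71d8) listed above mentions a byte swap that needs a single constant plus eight operations. The C sketch below shows a well-known sequence with that shape (one mask, five operations to swap bytes within each halfword, three to rotate by 16); it is an illustration with a hypothetical function name and is not claimed to be the exact TCG op sequence from the commit.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap a 32-bit value using one mask constant.
 * Byte layout comments use a = most significant byte, d = least. */
uint32_t bswap32_one_mask(uint32_t x)
{
    const uint32_t mask = 0x00ff00ffu;         /* the single constant */
    uint32_t lo = (x & mask) << 8;             /* abcd -> .b.d -> b.d. */
    uint32_t hi = (x >> 8) & mask;             /* abcd -> .abc -> .a.c */
    uint32_t swapped = lo | hi;                /* badc */
    return (swapped >> 16) | (swapped << 16);  /* rotate by 16 -> dcba */
}
```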
