qemu-e2k

Author	SHA1	Message	Date
Roman Bolshakov	b4e1af8961	i386: hvf: Fix register refs if REX is present According to Intel(R)64 and IA-32 Architectures Software Developer's Manual, the following one-byte registers should be fetched when REX prefix is present (sorted by reg encoding index): AL, CL, DL, BL, SPL, BPL, SIL, DIL, R8L - R15L The first 8 are fetched if REX.R is zero, the last 8 if non-zero. The following registers should be fetched for instructions without REX prefix (also sorted by reg encoding index): AL, CL, DL, BL, AH, CH, DH, BH Current emulation code doesn't handle accesses to SPL, BPL, SIL, DIL when REX is present, thefore an instruction 40883e "mov %dil,(%rsi)" is decoded as "mov %bh,(%rsi)". That caused an infinite loop in vp_reset: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03293.html Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20181018134401.44471-1-r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:12 +02:00
Vitaly Kuznetsov	6b7a98303b	i386/kvm: add support for Hyper-V IPI send Hyper-V PV IPI support is merged to KVM, enable the feature in Qemu. When enabled, this allows Windows guests to send IPIs to other vCPUs with a single hypercall even when there are >64 vCPUs in the request. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Roman Kagan <rkagan@virtuozzo.com> Message-Id: <20181009130853.6412-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:12 +02:00
Pavel Dovgalyuk	ca9759c2a9	replay: don't process events at virtual clock checkpoint As QEMU becomes more multi-threaded and non-synchronized, checkpoints move from thread to thread. And the event queue that processed at checkpoints should belong to the same thread in both record and replay executions. This patch disables asynchronous event processing at virtual clock checkpoint, because it may be invoked in different threads at record and replay. This patch is temporary fix until the checkpoints are completely refactored. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Message-Id: <20181018063345.7433.11678.stgit@pasha-VirtualBox> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:12 +02:00
Peng Hao	a8de011500	target-i386: add q35 0xcf8 port as coalesced_pio Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> Message-Id: <1539795177-21038-6-git-send-email-peng.hao2@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:11 +02:00
Peng Hao	37abf8d234	target-i386: add i440fx 0xcf8 port as coalesced_pio Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> Message-Id: <1539795177-21038-5-git-send-email-peng.hao2@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:11 +02:00
Peng Hao	f98167ea06	target-i386: add rtc 0x70 port as coalesced_pio Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> Message-Id: <1539890353-30273-1-git-send-email-peng.hao2@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:11 +02:00
Peng Hao	e6d34aeea6	target-i386 : add coalesced_pio API the primary API realization. Signed-off-by: Peng Hao <peng.hao2@zte.com.cn> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <1539795177-21038-3-git-send-email-peng.hao2@zte.com.cn> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:11 +02:00
Paolo Bonzini	966f2ec3ac	linux-headers: update to 4.20-rc1 This brings in eVMCS and coalesced PIO support, as well as other features we do not support yet. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:11 +02:00
Paolo Bonzini	b31c003895	target-i386: kvm: do not initialize padding fields The exception.pad field is going to be renamed to pending in an upcoming header file update. Remove the unnecessary initialization; it was introduced to please valgrind (commit `7e680753cf`) but they were later rendered unnecessary by commit `076796f8fd`, which added the "= {}" initializer to the declaration of "events". Therefore the patch does not change behavior in any way. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:04 +02:00
Artem Pisarenko	e81f86790f	qemu-timer: avoid checkpoints for virtual clock timers in external subsystems Adds EXTERNAL attribute definition to qemu timers subsystem and assigns it to virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue). Virtual clock processing in rr mode can use this attribute instead of a separate clock type. Fixes: `87f4fe7653` Fixes: `775a412bf8` Fixes: `9888091404` Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com> Message-Id: <e771f96ab94e86b54b9a783c974f2af3009fe5d1.1539764043.git.artem.k.pisarenko@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:03 +02:00
Artem Pisarenko	89a603a0c8	qemu-timer: introduce timer attributes Attributes are simple flags, associated with individual timers for their whole lifetime. They intended to be used to mark individual timers for special handling when they fire. New/init functions family in timer interface updated and refactored (new 'attribute' argument added, timer_list replaced with timer_list_group+type combinations, comments improved to avoid info duplication). Also existing aio interface extended with attribute-enabled variants of functions, which create/initialize timers. Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com> Message-Id: <f47b81dbce734e9806f9516eba8ca588e6321c2f.1539764043.git.artem.k.pisarenko@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:03 +02:00
Artem Pisarenko	05ff8dc32f	Revert some patches from recent [PATCH v6] "Fixing record/replay and adding reverse debugging" That patch series introduced new virtual clock type for use in external subsystems. It breaks desired behavior in non-record/replay usage scenarios due to a small change to existing behavior. Processing of virtual timers belonging to new clock type is kicked off to the main loop, which makes these timers asynchronous with vCPU thread and, in icount mode, with whole guest execution. This breaks expected determinism in non-record/replay icount mode of emulation where these "external subsystems" are isolated from the host (i.e. they are external only to guest core, not to the entire emulation environment). Example for slirp ("user" backend for network device): User runs qemu in icount mode with rtc clock=vm without any external communication interfaces but with "-netdev user,restrict=on". It expects deterministic execution, because network services are emulated inside qemu and isolated from host. There are no reasons to get reply from DHCP server with different delay or something like that. The next patches revert reimplements the same changes in a better way. This reverts commit `87f4fe7653`. This reverts commit `775a412bf8`. This reverts commit `9888091404`. Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com> Message-Id: <18b1e7c8f155fe26976f91be06bde98eef6f8751.1539764043.git.artem.k.pisarenko@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:03 +02:00
Paolo Bonzini	24f7973b67	es1370: more fixes for ADC_FRAMEADR and ADC_FRAMECNT They are not consecutive with DAC1_FRAME* and DAC2_FRAME*; Coverity still complains about es1370_read, while es1370_write was fixed in commit `cf9270e522`. Fixes: `154c1d1f96` Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2018-10-19 13:44:00 +02:00
Daniel P. Berrangé	dea7a64e4c	crypto: require libgcrypt >= 1.5.0 for building QEMU libgcrypt 1.5.0 was released in 2011 and all the distros that are build target platforms for QEMU [1] include it: RHEL-7: 1.5.3 Debian (Stretch): 1.7.6 Debian (Jessie): 1.6.3 OpenBSD (ports): 1.8.2 FreeBSD (ports): 1.8.3 OpenSUSE Leap 15: 1.8.2 Ubuntu (Xenial): 1.6.5 macOS (Homebrew): 1.8.3 Based on this, it is reasonable to require libgcrypt >= 1.5.0 in QEMU which allows for some conditional version checks in the code to be removed. [1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2018-10-19 12:26:57 +01:00
Daniel P. Berrangé	a0722409bc	crypto: require gnutls >= 3.1.18 for building QEMU gnutls 3.0.0 was released in 2011 and all the distros that are build target platforms for QEMU [1] include it: RHEL-7: 3.1.18 Debian (Stretch): 3.5.8 Debian (Jessie): 3.3.8 OpenBSD (ports): 3.5.18 FreeBSD (ports): 3.5.18 OpenSUSE Leap 15: 3.6.2 Ubuntu (Xenial): 3.4.10 macOS (Homebrew): 3.5.19 Based on this, it is reasonable to require gnutls >= 3.1.18 in QEMU which allows for all conditional version checks in the code to be removed. [1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2018-10-19 12:26:57 +01:00
Peter Maydell	1b7490446b	Add a workaround for clang bug and remove misleading comment (sparc) -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJbyNhBAAoJEPMMOL0/L748u/sQALDmpdHXmqgiA9YPYGSg6Yn5 J6TsMs9O+DcgIMmLkYcvHEajJf5R6j5hO4HRnrqefnEaAQHMtoDNxTMqTqyiRyyd rIeokVauBeDrnr88XxRGGDTfyKMp9qR255wjpaueKtRmloHN+EvgQ+a9vgZlqDoi CmpmA05wVYdW2ku3uk5QtrGsfmLsUnT9ETTs+/kU9uoVujnYe+Ix77kDb8BYe1zz OL2aBu5f5LdJKqvbIMsxHg7m32MxG4swLf3gjD6wl5R711Pin9Uidpg7mzVmmElp mUTuSSJtTbqqM15NanQbfXAoBBStM+ILH5juaHjNC5iA8Li1AL2+KVckWELNOJnd 0tbagKS8MAiHw9sExMrREArpqsusJ6YUaHMhlLdtnV+r8YKry1iK1nFS8KIdbY3r 4stL/H7dKfvtSlSA4bF0zcwZwqJMvX5qNKT8fXUV2j2/i6ttQahL/mwqClQDcuFA LkdMCcI+TXvbt04KeYE9eGbWUg2JFFlf2qiX2bD/tUqTDLjFP15YFtpY+3B0FnLW EUdooDKsjlyz562SIm9ccGlyNwKpsSVenUuU4n0tmCq8PVe3S7UyXQH+w+a7mWZZ sOOFc64gTB+8Z8wUdAm6MXCgUoKavfGUPYAIq5cbYcxEgQ8XuMunx5TtDnjC2tZ1 EwPPfdzK8LLaVYxTLkr/ =LSxi -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.1-pull-request' into staging Add a workaround for clang bug and remove misleading comment (sparc) # gpg: Signature made Thu 18 Oct 2018 20:00:17 BST # gpg: using RSA key F30C38BD3F2FBE3C # gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" # gpg: aka "Laurent Vivier <laurent@vivier.eu>" # gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" # Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C * remotes/vivier2/tags/linux-user-for-3.1-pull-request: linux-user/sparc/signal.c: Remove unnecessary comment linux-user: Suppress address-of-packed-member warnings in __get/put_user_e Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-10-19 11:20:05 +01:00
Peter Maydell	2ec24af237	MIPS queue October 2018, part1, v2 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQEcBAABAgAGBQJbyNNEAAoJENSXKoln91plgPMH/iHilm6MrW7r6tvJloYEBwfZ e8AkQzWcWq6rpWYhbiuGvWY2Qn1EWoTWFfohEvJ96gkJIZCVwO7sTqD2//58tksp wWpgeQwLxRCd+pB6zBMmYkpPD4WNEHGq7RYTzA+0pBIjwTEjdct0POgmLaiXBnFP mE5m6wohyAlxPpLLfluYEPz6cTIm20M191tv9scDoztKJnd/8u5M3j+yP8t2zbFs pRdOk68YsJl2fOKxgmnLsg83VxpoQahEzOgX1Zi396tcISZMCnuK+4GWJJ9MbPMl b1a9TdjHoNjx0v0qzNWk3RzECE7fBnnxpa8p1pil9Ff5MCq9gwFjxHvnCLZzY0s= =uRwj -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-october-2018-part1-v2' into staging MIPS queue October 2018, part1, v2 # gpg: Signature made Thu 18 Oct 2018 19:39:00 BST # gpg: using RSA key D4972A8967F75A65 # gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>" # gpg: WARNING: This key is not certified with a trusted signature! # gpg: There is no indication that the signature belongs to the owner. # Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01 DD75 D497 2A89 67F7 5A65 * remotes/amarkovic/tags/mips-queue-october-2018-part1-v2: (28 commits) target/mips: Add opcodes for nanoMIPS EVA instructions target/mips: Fix misplaced 'break' in handling of NM_SHRA_R_PH target/mips: Fix emulation of microMIPS R6 <SELEQZ\|SELNEZ>.<D\|S> target/mips: Implement hardware page table walker for MIPS32 target/mips: Add reset state for PWSize and PWField registers target/mips: Add CP0 PWCtl register target/mips: Add CP0 PWSize register target/mips: Add CP0 PWField register target/mips: Add CP0 PWBase register target/mips: Add CP0 Config2 to DisasContext target/mips: Improve DSP R2/R3-related naming target/mips: Add availability control for DSP R3 ASE target/mips: Add bit definitions for DSP R3 ASE target/mips: Reorganize bit definitions for insn_flags (ISAs/ASEs flags) target/mips: Increase 'supported ISAs/ASEs' flag holder size target/mips: Add opcode values of MXU ASE target/mips: Add organizational chart of MXU ASE target/mips: Add assembler mnemonics list for MXU ASE target/mips: Add basic description of MXU ASE target/mips: Add a comment before each CP0 register section in cpu.h ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2018-10-19 10:08:31 +01:00
Thomas Huth	37a4442a76	qemu-options: Fix bad "macaddr" property in the documentation When using the "-device" option, the property is called "mac". "macaddr" is only used for the legacy "-net nic" option. Reported-by: Harald Hoyer <harald@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:04 +08:00
Jason Wang	1001cf45a7	e1000: indicate dropped packets in HW counters The e1000 emulation silently discards RX packets if there's insufficient space in the ring buffer. This leads to errors on higher-level protocols in the guest, with no indication about the error cause. This patch increments the "Missed Packets Count" (MPC) and "Receive No Buffers Count" (RNBC) HW counters in this case. As the emulation has no FIFO for buffering packets that can't immediately be pushed to the guest, these two registers are practically equivalent (see 10.2.7.4, 10.2.7.33 in https://www.intel.com/content/www/us/en/embedded/products/networking/82574l-gbe-controller-datasheet.html). On a Linux guest, the register content will be reflected in the "rx_missed_errors" and "rx_no_buffer_count" stats from "ethtool -S", and in the "missed" stat from "ip -s -s link show", giving at least some hint about the error cause inside the guest. If the cause is known, problems like this can often be avoided easily, by increasing the number of RX descriptors in the guest e1000 driver (e.g under Linux, "e1000.RxDescriptors=1024"). The patch also adds a qemu trace message for this condition. Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:04 +08:00
Jason Wang	1592a99470	net: ignore packet size greater than INT_MAX There should not be a reason for passing a packet size greater than INT_MAX. It's usually a hint of bug somewhere, so ignore packet size greater than INT_MAX in qemu_deliver_packet_iov() CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:04 +08:00
Jason Wang	b1d80d12c5	pcnet: fix possible buffer overflow In pcnet_receive(), we try to assign size_ to size which converts from size_t to integer. This will cause troubles when size_ is greater INT_MAX, this will lead a negative value in size and it can then pass the check of size < MIN_BUF_SIZE which may lead out of bound access for both buf and buf1. Fixing by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:04 +08:00
Jason Wang	1a326646fe	rtl8139: fix possible out of bound access In rtl8139_do_receive(), we try to assign size_ to size which converts from size_t to integer. This will cause troubles when size_ is greater INT_MAX, this will lead a negative value in size and it can then pass the check of size < MIN_BUF_SIZE which may lead out of bound access of for both buf and buf1. Fixing by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:04 +08:00
Jason Wang	fdc89e90fa	ne2000: fix possible out of bound access in ne2000_receive In ne2000_receive(), we try to assign size_ to size which converts from size_t to integer. This will cause troubles when size_ is greater INT_MAX, this will lead a negative value in size and it can then pass the check of size < MIN_BUF_SIZE which may lead out of bound access of for both buf and buf1. Fixing by converting the type of size to size_t. CC: qemu-stable@nongnu.org Reported-by: Daniel Shapira <daniel@twistlock.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:04 +08:00
liujunjie	7da2d99fb9	clean up callback when del virtqueue Before, we did not clear callback like handle_output when delete the virtqueue which may result be segmentfault. The scene is as follows: 1. Start a vm with multiqueue vhost-net, 2. then we write VIRTIO_PCI_GUEST_FEATURES in PCI configuration to triger multiqueue disable in this vm which will delete the virtqueue. In this step, the tx_bh is deleted but the callback virtio_net_handle_tx_bh still exist. 3. Finally, we write VIRTIO_PCI_QUEUE_NOTIFY in PCI configuration to notify the deleted virtqueue. In this way, virtio_net_handle_tx_bh will be called and qemu will be crashed. Although the way described above is uncommon, we had better reinforce it. CC: qemu-stable@nongnu.org Signed-off-by: liujunjie <liujunjie23@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	8e640892ec	docs: Add COLO status diagram to COLO-FT.txt This diagram make user better understand COLO. Suggested by Markus Armbruster. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	2518aec192	COLO: quick failover process by kick COLO thread COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem), while failover works begin, It's better to wakeup it to quick the process. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	7b3435309d	COLO: notify net filters about checkpoint/failover event Notify all net filters about the checkpoint and failover event. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	24525e93c1	filter-rewriter: handle checkpoint and failover event After one round of checkpoint, the states between PVM and SVM become consistent, so it is unnecessary to adjust the sequence of net packets for old connections, besides, while failover happens, filter-rewriter will into failover mode that needn't handle the new TCP connection. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	5fbba3d659	filter: Add handle_event method for NetFilterClass Filter needs to process the event of checkpoint/failover or other event passed by COLO frame. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	d1955d2219	COLO: flush host dirty ram from cache Don't need to flush all VM's ram from cache, only flush the dirty pages since last checkpoint Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	3f6df99d9d	savevm: split the process of different stages for loadvm/savevm There are several stages during loadvm/savevm process. In different stage, migration incoming processes different types of sections. We want to control these stages more accuracy, it will benefit COLO performance, we don't have to save type of QEMU_VM_SECTION_START sections everytime while do checkpoint, besides, we want to separate the process of saving/loading memory and devices state. So we add three new helper functions: qemu_load_device_state() and qemu_savevm_live_state() to achieve different process during migration. Besides, we make qemu_loadvm_state_main() and qemu_save_device_state() public, and simplify the codes of qemu_save_device_state() by calling the wrapper qemu_savevm_state_header(). Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	f56c0065b8	qapi: Add new command to query colo status Libvirt or other high level software can use this command query colo status. You can test this command like that: {'execute':'query-colo-status'} Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	41b6b77921	qapi/migration.json: Rename COLO unknown mode to none mode. Suggested by Markus Armbruster rename COLO unknown mode to none mode. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
zhanghailiang	9ecff6d66e	qmp event: Add COLO_EXIT event to notify users while exited COLO If some errors happen during VM's COLO FT stage, it's important to notify the users of this event. Together with 'x-colo-lost-heartbeat', Users can intervene in COLO's failover work immediately. If users don't want to get involved in COLO's failover verdict, it is still necessary to notify users that we exited COLO mode. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	e6f4aa188c	COLO: Flush memory data from ram cache During the time of VM's running, PVM may dirty some pages, we will transfer PVM's dirty pages to SVM and store them into SVM's RAM cache at next checkpoint time. So, the content of SVM's RAM cache will always be same with PVM's memory after checkpoint. Instead of flushing all content of PVM's RAM cache into SVM's MEMORY, we do this in a more efficient way: Only flush any page that dirtied by PVM since last checkpoint. In this way, we can ensure SVM's memory same with PVM's. Besides, we must ensure flush RAM cache before load device state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	7d9acafa2c	ram/COLO: Record the dirty pages that SVM received We record the address of the dirty pages that received, it will help flushing pages that cached into SVM. Here, it is a trick, we record dirty pages by re-using migration dirty bitmap. In the later patch, we will start the dirty log for SVM, just like migration, in this way, we can record both the dirty pages caused by PVM and SVM, we only flush those dirty pages from RAM cache while do checkpoint. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	13af18f222	COLO: Load dirty pages into SVM's RAM cache firstly We should not load PVM's state directly into SVM, because there maybe some errors happen when SVM is receving data, which will break SVM. We need to ensure receving all data before load the state into SVM. We use an extra memory to cache these data (PVM's ram). The ram cache in secondary side is initially the same as SVM/PVM's memory. And in the process of checkpoint, we cache the dirty pages of PVM into this ram cache firstly, so this ram cache always the same as PVM's memory at every checkpoint, then we flush this cached ram to SVM after we receive all PVM's state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	aad555c229	COLO: Remove colo_state migration struct We need to know if migration is going into COLO state for incoming side before start normal migration. Instead by using the VMStateDescription to send colo_state from source side to destination side, we use MIG_CMD_ENABLE_COLO to indicate whether COLO is enabled or not. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	8e48ac9586	COLO: Add block replication into colo process Make sure master start block replication after slave's block replication started. Besides, we need to activate VM's blocks before goes into COLO state. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	131b2153fc	COLO: integrate colo compare with colo frame For COLO FT, both the PVM and SVM run at the same time, only sync the state while it needs. So here, let SVM runs while not doing checkpoint, change DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100. Besides, we forgot to release colo_checkpoint_semd and colo_delay_timer, fix them here. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	dccd0313b6	colo-compare: use notifier to notify packets comparing result It's a good idea to use notifier to notify COLO frame of inconsistent packets comparing. Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	0ffcece325	colo-compare: implement the process of checkpoint While do checkpoint, we need to flush all the unhandled packets, By using the filter notifier mechanism, we can easily to notify every compare object to do this process, which runs inside of compare threads as a coroutine. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Zhang Chen	6214231abd	filter-rewriter: Add TCP state machine and fix memory leak in connection_track_table We add almost full TCP state machine in filter-rewriter, except TCPS_LISTEN and some simplify in VM active close FIN states. The reason for this simplify job is because guest kernel will track the TCP status and wait 2MSL time too, if client resend the FIN packet, guest will resend the last ACK, so we needn't wait 2MSL time in filter-rewriter. After a net connection is closed, we didn't clear its related resources in connection_track_table, which will lead to memory leak. Let's track the state of net connection, if it is closed, its related resources will be cleared up. Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com> Signed-off-by: Zhang Chen <zhangckid@gmail.com> Signed-off-by: Zhang Chen <chen.zhang@intel.com> Signed-off-by: Jason Wang <jasowang@redhat.com>	2018-10-19 11:15:03 +08:00
Emilio G. Cota	403f290c06	cputlb: read CPUTLBEntry.addr_write atomically Updates can come from other threads, so readers that do not take tlb_lock must use atomic_read to avoid undefined behaviour (UB). This completes the conversion to tlb_lock. This conversion results on average in no performance loss, as the following experiments (run on an Intel i7-6700K CPU @ 4.00GHz) show. 1. aarch64 bootup+shutdown test: - Before: Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs): 7487.087786 task-clock (msec) # 0.998 CPUs utilized ( +- 0.12% ) 31,574,905,303 cycles # 4.217 GHz ( +- 0.12% ) 57,097,908,812 instructions # 1.81 insns per cycle ( +- 0.08% ) 10,255,415,367 branches # 1369.747 M/sec ( +- 0.08% ) 173,278,962 branch-misses # 1.69% of all branches ( +- 0.18% ) 7.504481349 seconds time elapsed ( +- 0.14% ) - After: Performance counter stats for 'taskset -c 0 ../img/aarch64/die.sh' (10 runs): 7462.441328 task-clock (msec) # 0.998 CPUs utilized ( +- 0.07% ) 31,478,476,520 cycles # 4.218 GHz ( +- 0.07% ) 57,017,330,084 instructions # 1.81 insns per cycle ( +- 0.05% ) 10,251,929,667 branches # 1373.804 M/sec ( +- 0.05% ) 173,023,787 branch-misses # 1.69% of all branches ( +- 0.11% ) 7.474970463 seconds time elapsed ( +- 0.07% ) 2. SPEC06int: SPEC06int (test set) [Y axis: Speedup over master] 1.15 +-+----+------+------+------+------+------+-------+------+------+------+------+------+------+----+-+ \| \| 1.1 +-+.................................+++.............................+ tlb-lock-v2 (m+++x) +-+ \| +++ \| +++ tlb-lock-v3 (spinl\|ck) \| \| +++ \| \| +++ +++ \| \| \| 1.05 +-+....+++...........####.........\|####.+++.\|......\|.....###....+++...........+++....###.........+-+ \| ### ++#\| # \|# \|# *### +++### +++#+# \| +++ \| #\|# ### \| 1 +-++++#++++####+++#++#++++++++++#++#++++#++++#+#+**+#++++###++++###++++###++++#+#++++#+#+++-+ \| +* # #++# * # #### * # * ++# **+# \| * # ***\|# \|# # #\|# #+# # # \| 0.95 +-+....#....#..#.\|..#...#..#.\|..#....#.\|..#.++.#.+++#.**.#....#+#....#.#..++#.#..+-+ \| * # # # \| # # # \| # * * # ++ # * * # * * # * \|* # ++# # # # *** # \| \| * * # ++# # + # # # \| # * * # * * # * * # * * # ++ # **** # ++# # * * # \| 0.9 +-+....#...\|#..#....#.++#..#.\|..#....#....#....#....#....#..\|.#...\|#.#....#..+-+ \| * * # *** # * * # \|# # + # * * # * * # * * # * * # * * # ++ # \|# # * * # \| 0.85 +-+....#..\|..#....#.**..#....#....#....#....#....#....#....#.**.#....#..+-+ \| * # + # * * # \| # * * # * * # * * # * * # * * # * * # * * # * \|* # * * # \| \| * * # * * # * * # + # * * # * * # * * # * * # * * # * * # * * # * \|* # * * # \| 0.8 +-+....#.....#....#....#....#....#....#....#....#....#....#.++.#....#..+-+ \| * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # * * # \| 0.75 +-+--*##--###-###-###-###-###-*##-##-##-##-##-##--*##--+-+ 400.perlben401.bzip2403.gcc429.m445.gob456.hmme45462.libqua464.h26471.omnet473483.xalancbmkgeomean png: https://imgur.com/a/BHzpPTW Notes: - tlb-lock-v2 corresponds to an implementation with a mutex. - tlb-lock-v3 corresponds to the current implementation, i.e. a spinlock and a single lock acquisition in tlb_set_page_with_attrs. Signed-off-by: Emilio G. Cota <cota@braap.org> Message-Id: <20181016153840.25877-1-cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	830bf10c82	target/s390x: Check HAVE_ATOMIC128 and HAVE_CMPXCHG128 at translate Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	72d8ad67ba	target/s390x: Skip wout, cout helpers if op helper does not return When op raises an exception, it may not have initialized the output temps that would be written back by wout or cout. Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	0c9fa16805	target/s390x: Split do_cdsg, do_lpq, do_stpq Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	5e95612e2e	target/s390x: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128 Reviewed-by: David Hildenbrand <david@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	f34ec0f6d7	target/ppc: Convert to HAVE_CMPXCHG128 and HAVE_ATOMIC128 Reviewed-by: Emilio G. Cota <cota@braap.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00
Richard Henderson	62823083b8	target/arm: Check HAVE_CMPXCHG128 at translate time Reviewed-by: Emilio G. Cota <cota@braap.org> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>	2018-10-18 19:46:53 -07:00

1 2 3 4 5 ...

64501 Commits