qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Vitaly Kuznetsov	3aae0854b2	i386: Hyper-V Direct TLB flush hypercall Hyper-V TLFS allows for L0 and L1 hypervisors to collaborate on L2's TLB flush hypercalls handling. With the correct setup, L2's TLB flush hypercalls can be handled by L0 directly, without the need to exit to L1. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	aa6bb5fad5	i386: Hyper-V Support extended GVA ranges for TLB flush hypercalls KVM kind of supported "extended GVA ranges" (up to 4095 additional GFNs per hypercall) since the implementation of Hyper-V PV TLB flush feature (Linux-4.18) as regardless of the request, full TLB flush was always performed. "Extended GVA ranges for TLB flush hypercalls" feature bit wasn't exposed then. Now, as KVM gains support for fine-grained TLB flush handling, exposing this feature starts making sense. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	9411e8b6fa	i386: Hyper-V XMM fast hypercall input feature Hyper-V specification allows to pass parameters for certain hypercalls using XMM registers ("XMM Fast Hypercall Input"). When the feature is in use, it allows for faster hypercalls processing as KVM can avoid reading guest's memory. KVM supports the feature since v5.14. Rename HV_HYPERCALL_{PARAMS_XMM_AVAILABLE -> XMM_INPUT_AVAILABLE} to comply with KVM. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	869840d26c	i386: Hyper-V Enlightened MSR bitmap feature The newly introduced enlightenment allows L0 (KVM) and L1 (Hyper-V) hypervisors to collaborate to avoid unnecessary updates to L2 MSR-Bitmap upon vmexits. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-3-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Vitaly Kuznetsov	7110fe56c1	i386: Use hv_build_cpuid_leaf() for HV_CPUID_NESTED_FEATURES Previously, HV_CPUID_NESTED_FEATURES.EAX CPUID leaf was handled differently as it was only used to encode the supported eVMCS version range. In fact, there are also feature (e.g. Enlightened MSR-Bitmap) bits there. In preparation to adding these features, move HV_CPUID_NESTED_FEATURES leaf handling to hv_build_cpuid_leaf() and drop now-unneeded 'hyperv_nested'. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20220525115949.1294004-2-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-25 21:26:35 +02:00
Yang Weijiang	3a7a27cffb	target/i386: Remove LBREn bit check when access Arch LBR MSRs Live migration can happen when Arch LBR LBREn bit is cleared, e.g., when migration happens after guest entered SMM mode. In this case, we still need to migrate Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220517155024.33270-1-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-23 10:56:01 +02:00
Yang Weijiang	12703d4e75	target/i386: Add MSR access interface for Arch LBR In the first generation of Arch LBR, the maximum supported Arch LBR depth is 32, and both host and guest use that value to set the depth MSR. This simplifies the patch, given the side effect of a host/guest depth MSR mismatch: XRSTORS will reset all recording MSRs to 0s if the saved depth mismatches MSR_ARCH_LBR_DEPTH. In most cases Arch LBR is not active, so check the control bit before saving/restoring the big chunk of Arch LBR MSRs. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-7-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Yang Weijiang	5a778a5f82	target/i386: Add kvm_get_one_msr helper When try to get one msr from KVM, I found there's no such kind of existing interface while kvm_put_one_msr() is there. So here comes the patch. It'll remove redundant preparation code before finally call KVM_GET_MSRS IOCTL. No functional change intended. Signed-off-by: Yang Weijiang <weijiang.yang@intel.com> Message-Id: <20220215195258.29149-4-weijiang.yang@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-05-14 12:32:41 +02:00
Jon Doron	d8701185f4	hw: hyperv: Initial commit for Synthetic Debugging device Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-5-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
Jon Doron	73d2407407	hyperv: Add support to process syndbg commands SynDbg commands can come from two different flows: 1. Hypercalls: in this mode the data being sent is fully encapsulated network packets. 2. SynDbg-specific MSRs: in this mode only the data that needs to be transferred is passed. Signed-off-by: Jon Doron <arilou@gmail.com> Reviewed-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Message-Id: <20220216102500.692781-4-arilou@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:56 +02:00
luofei	cb48748af7	i386: Set MCG_STATUS_RIPV bit for mce SRAR error In the physical machine environment, when a SRAR error occurs, the IA32_MCG_STATUS RIPV bit is set, but qemu does not set this bit. When qemu injects an SRAR error into virtual machine, the virtual machine kernel just call do_machine_check() to kill the current task, but not call memory_failure() to isolate the faulty page, which will cause the faulty page to be allocated and used repeatedly. If used by the virtual machine kernel, it will cause the virtual machine to crash Signed-off-by: luofei <luofei@unicloud.com> Message-Id: <20220120084634.131450-1-luofei@unicloud.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Philippe Mathieu-Daudé	dcebbb65b8	target/i386/kvm: Free xsave_buf when destroying vCPU Fix vCPU hot-unplug related leak reported by Valgrind: ==132362== 4,096 bytes in 1 blocks are definitely lost in loss record 8,440 of 8,549 ==132362== at 0x4C3B15F: memalign (vg_replace_malloc.c:1265) ==132362== by 0x4C3B288: posix_memalign (vg_replace_malloc.c:1429) ==132362== by 0xB41195: qemu_try_memalign (memalign.c:53) ==132362== by 0xB41204: qemu_memalign (memalign.c:73) ==132362== by 0x7131CB: kvm_init_xsave (kvm.c:1601) ==132362== by 0x7148ED: kvm_arch_init_vcpu (kvm.c:2031) ==132362== by 0x91D224: kvm_init_vcpu (kvm-all.c:516) ==132362== by 0x9242C9: kvm_vcpu_thread_fn (kvm-accel-ops.c:40) ==132362== by 0xB2EB26: qemu_thread_start (qemu-thread-posix.c:556) ==132362== by 0x7EB2159: start_thread (in /usr/lib64/libpthread-2.28.so) ==132362== by 0x9D45DD2: clone (in /usr/lib64/libc-2.28.so) Reported-by: Mark Kanda <mark.kanda@oracle.com> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Tested-by: Mark Kanda <mark.kanda@oracle.com> Message-Id: <20220322120522.26200-1-philippe.mathieu.daude@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-23 12:22:25 +01:00
Paolo Bonzini	3ec5ad4008	target/i386: kvm: do not access uninitialized variable on older kernels KVM support for AMX includes a new system attribute, KVM_X86_XCOMP_GUEST_SUPP. Commit `19db68ca68` ("x86: Grant AMX permission for guest", 2022-03-15) however did not fully consider the behavior on older kernels. First, it warns too aggressively. Second, it invokes the KVM_GET_DEVICE_ATTR ioctl unconditionally and then uses the "bitmask" variable, which remains uninitialized if the ioctl fails. Third, kvm_ioctl returns -errno rather than -1 on errors. While at it, explain why the ioctl is needed and KVM_GET_SUPPORTED_CPUID is not enough. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-20 20:38:52 +01:00
Zeng Guang	cdec2b753b	x86: Support XFD and AMX xsave data migration XFD(eXtended Feature Disable) allows to enable a feature on xsave state while preventing specific user threads from using the feature. Support save and restore XFD MSRs if CPUID.D.1.EAX[4] enumerate to be valid. Likewise migrate the MSRs and related xsave state necessarily. Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-8-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	e56dd3c70a	x86: add support for KVM_CAP_XSAVE2 and AMX state migration When dynamic xfeatures (e.g. AMX) are used by the guest, the xsave area would be larger than 4KB. KVM_GET_XSAVE2 and KVM_SET_XSAVE under KVM_CAP_XSAVE2 works with a xsave buffer larger than 4KB. Always use the new ioctls under KVM_CAP_XSAVE2 when KVM supports it. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Zeng Guang <guang.zeng@intel.com> Signed-off-by: Wei Wang <wei.w.wang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-7-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Jing Liu	f21a48171c	x86: Add AMX CPUIDs enumeration Add AMX primary feature bits XFD and AMX_TILE to enumerate the CPU's AMX capability. Meanwhile, add AMX TILE and TMUL CPUID leaf and subleaves which exist when AMX TILE is present to provide the maximum capability of TILE and TMUL. Signed-off-by: Jing Liu <jing2.liu@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20220217060434.52460-6-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Yang Zhong	19db68ca68	x86: Grant AMX permission for guest Kernel allocates 4K xstate buffer by default. For XSAVE features which require large state component (e.g. AMX), Linux kernel dynamically expands the xstate buffer only after the process has acquired the necessary permissions. Those are called dynamically-enabled XSAVE features (or dynamic xfeatures). There are separate permissions for native tasks and guests. Qemu should request the guest permissions for dynamic xfeatures which will be exposed to the guest. This only needs to be done once before the first vcpu is created. KVM implemented one new ARCH_GET_XCOMP_SUPP system attribute API to get host side supported_xcr0 and Qemu can decide if it can request dynamically enabled XSAVE features permission. https://lore.kernel.org/all/20220126152210.3044876-1-pbonzini@redhat.com/ Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Signed-off-by: Jing Liu <jing2.liu@intel.com> Message-Id: <20220217060434.52460-4-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:50:50 +01:00
Longpeng(Mike)	def4c5570c	kvm/msi: do explicit commit when adding msi routes We invoke the kvm_irqchip_commit_routes() for each addition to MSI route table, which is not efficient if we are adding lots of routes in some cases. This patch lets callers invoke the kvm_irqchip_commit_routes(), so the callers can decide how to optimize. [1] https://lists.gnu.org/archive/html/qemu-devel/2021-11/msg00967.html Signed-off-by: Longpeng <longpeng2@huawei.com> Message-Id: <20220222141116.2091-3-longpeng2@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-03-15 11:26:20 +01:00
Peter Maydell	5df022cf2e	osdep: Move memalign-related functions to their own header Move the various memalign-related functions out of osdep.h and into their own header, which we include only where they are used. While we're doing this, add some brief documentation comments. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org> Message-id: 20220226180723.1706285-10-peter.maydell@linaro.org	2022-03-07 13:16:49 +00:00
Paolo Bonzini	1520f8bb67	KVM: x86: ignore interrupt_bitmap field of KVM_GET/SET_SREGS This is unnecessary, because the interrupt would be retrieved and queued anyway by KVM_GET_VCPU_EVENTS and KVM_SET_VCPU_EVENTS respectively, and it makes the flow more similar to the one for KVM_GET/SET_SREGS2. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:06 +01:00
Maxim Levitsky	8f515d3869	KVM: use KVM_{GET|SET}_SREGS2 when supported. This makes PDPTRs part of the migration stream, so they need not be reloaded after migration; reloading them is against the x86 spec. Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-2-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-01-12 14:09:06 +01:00
Philippe Mathieu-Daudé	dc7d6cafce	target/i386/kvm: Replace use of __u32 type QEMU coding style mandates to not use Linux kernel internal types for scalars types. Replace __u32 by uint32_t. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Message-Id: <20211116193955.2793171-1-philmd@redhat.com> Signed-off-by: Laurent Vivier <laurent@vivier.eu>	2021-12-17 10:40:51 +01:00
Maxim Levitsky	cabf9862e4	KVM: SVM: add migration support for nested TSC scaling Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-Id: <20211101132300.192584-4-mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-11-02 15:57:27 +01:00
Philippe Mathieu-Daudé	deae846f94	target/i386/sev: Declare system-specific functions in 'sev.h' "sysemu/sev.h" is only used from x86-specific files. Let's move it to include/hw/i386, and merge it with target/i386/sev.h. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20211007161716.453984-16-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Philippe Mathieu-Daudé	93777de365	target/i386/sev: Rename sev_i386.h -> sev.h SEV is a x86 specific feature, and the "sev_i386.h" header is already in target/i386/. Rename it as "sev.h" to simplify. Patch created mechanically using: $ git mv target/i386/sev_i386.h target/i386/sev.h $ sed -i s/sev_i386.h/sev.h/ $(git grep -l sev_i386.h) Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20211007161716.453984-15-philmd@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-13 10:47:49 +02:00
Vitaly Kuznetsov	af7228b88d	i386: Make Hyper-V version id configurable Currently, we hardcode Hyper-V version id (CPUID 0x40000002) to WS2008R2 and it is known that certain tools in Windows check this. It seems useful to provide some flexibility by making it possible to change this info at will. CPUID information is defined in TLFS as: EAX: Build Number EBX Bits 31-16: Major Version Bits 15-0: Minor Version ECX Service Pack EDX Bits 31-24: Service Branch Bits 23-0: Service Number Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-8-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	e1f9a8e8c9	i386: Implement pseudo 'hv-avic' ('hv-apicv') enlightenment The enlightenment allows to use Hyper-V SynIC with hardware APICv/AVIC enabled. Normally, Hyper-V SynIC disables these hardware features and suggests the guest to use paravirtualized AutoEOI feature. Linux-4.15 gains support for conditional APICv/AVIC disablement, the feature stays on until the guest tries to use AutoEOI feature with SynIC. With 'HV_DEPRECATING_AEOI_RECOMMENDED' bit exposed, modern enough Windows/ Hyper-V versions should follow the recommendation and not use the (unwanted) feature. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-7-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	050716292a	i386: Move HV_APIC_ACCESS_RECOMMENDED bit setting to hyperv_fill_cpuids() In preparation to enabling Hyper-V + APICv/AVIC move HV_APIC_ACCESS_RECOMMENDED setting out of kvm_hyperv_properties[]: the 'real' feature bit for the vAPIC features is HV_APIC_ACCESS_AVAILABLE, HV_APIC_ACCESS_RECOMMENDED is a recommendation to use the feature which we may not always want to give. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-6-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	70367f0917	i386: Support KVM_CAP_HYPERV_ENFORCE_CPUID By default, KVM allows the guest to use all currently supported Hyper-V enlightenments when Hyper-V CPUID interface was exposed, regardless of if some features were not announced in guest visible CPUIDs. hv-enforce-cpuid feature alters this behavior and only allows the guest to use exposed Hyper-V enlightenments. The feature is supported by Linux >= 5.14 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-5-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Vitaly Kuznetsov	988f7b8bfe	i386: Support KVM_CAP_ENFORCE_PV_FEATURE_CPUID By default, KVM allows the guest to use all currently supported PV features even when they were not announced in guest visible CPUIDs. Introduce a new "kvm-pv-enforce-cpuid" flag to limit the supported feature set to the exposed features. The feature is supported by Linux >= 5.10 and is not enabled by default in QEMU. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210902093530.345756-4-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-10-01 19:04:45 +02:00
Peter Xu	142518bda5	memory: Name all the memory listeners Provide a name field for all the memory listeners. It can be used to identify which memory listener is which. Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: David Hildenbrand <david@redhat.com> Message-Id: <20210817013553.30584-2-peterx@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 15:30:24 +02:00
Sean Christopherson	b9edbadefb	i386: Propagate SGX CPUID sub-leafs to KVM The SGX sub-leafs are enumerated at CPUID 0x12. Indices 0 and 1 are always present when SGX is supported, and enumerate SGX features and capabilities. Indices >=2 are directly correlated with the platform's EPC sections. Because the number of EPC sections is dynamic and user defined, the number of SGX sub-leafs is "NULL" terminated. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-15-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Sean Christopherson	c22f546785	i386: kvm: Add support for exposing PROVISIONKEY to guest If the guest want to fully use SGX, the guest needs to be able to access provisioning key. Add a new KVM_CAP_SGX_ATTRIBUTE to KVM to support provisioning key to KVM guests. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-14-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Sean Christopherson	a04835414b	i386: Add feature control MSR dependency when SGX is enabled SGX adds multiple flags to FEATURE_CONTROL to enable SGX and Flexible Launch Control. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-12-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Sean Christopherson	db88806523	i386: Add get/set/migrate support for SGX_LEPUBKEYHASH MSRs On real hardware, on systems that supports SGX Launch Control, those MSRs are initialized to digest of Intel's signing key; on systems that don't support SGX Launch Control, those MSRs are not available but hardware always uses digest of Intel's signing key in EINIT. KVM advertises SGX LC via CPUID if and only if the MSRs are writable. Unconditionally initialize those MSRs to digest of Intel's signing key when CPU is realized and reset to reflect the fact. This avoids potential bug in case kvm_arch_put_registers() is called before kvm_arch_get_registers() is called, in which case guest's virtual SGX_LEPUBKEYHASH MSRs will be set to 0, although KVM initializes those to digest of Intel's signing key by default, since KVM allows those MSRs to be updated by Qemu to support live migration. Save/restore the SGX Launch Enclave Public Key Hash MSRs if SGX Launch Control (LC) is exposed to the guest. Likewise, migrate the MSRs if they are writable by the guest. Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com> Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Yang Zhong <yang.zhong@intel.com> Message-Id: <20210719112136.57018-11-yang.zhong@intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 14:50:20 +02:00
Markus Armbruster	436c831a28	migration: Unify failure check for migrate_add_blocker() Most callers check the return value. Some check whether it set an error. Functionally equivalent, but the former tends to be easier on the eyes, so do that everywhere. Prior art: commit `c6ecec43b2` "qemu-option: Check return value instead of @err where convenient". Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-10-armbru@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Markus Armbruster	a5c051b2cf	i386: Never free migration blocker objects instead of sometimes invtsc_mig_blocker has static storage duration. When a CPU with certain features is initialized, and invtsc_mig_blocker is still null, we add a migration blocker and store it in invtsc_mig_blocker. The object is freed when migrate_add_blocker() fails, leaving invtsc_mig_blocker dangling. It is not freed on later failures. Same for hv_passthrough_mig_blocker and hv_no_nonarch_cs_mig_blocker. All failures are actually fatal, so whether we free or not doesn't really matter, except as bad examples to be copied / imitated. Clean this up in a minimal way: never free these blocker objects. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Marcelo Tosatti <mtosatti@redhat.com> Cc: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <20210720125408.387910-7-armbru@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com>	2021-08-26 17:15:28 +02:00
Vitaly Kuznetsov	e4adb09f79	i386: assert 'cs->kvm_state' is not null Coverity reports potential NULL pointer dereference in get_supported_hv_cpuid_legacy() when 'cs->kvm_state' is NULL. While 'cs->kvm_state' can indeed be NULL in hv_cpuid_get_host(), kvm_hyperv_expand_features() makes sure that it only happens when KVM_CAP_SYS_HYPERV_CPUID is supported and KVM_CAP_SYS_HYPERV_CPUID implies KVM_CAP_HYPERV_CPUID so get_supported_hv_cpuid_legacy() is never really called. Add asserts to strengthen the protection against broken KVM behavior. Coverity: CID 1458243 Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210716115852.418293-1-vkuznets@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-29 10:15:51 +02:00
Vitaly Kuznetsov	cce087f628	i386: Hyper-V SynIC requires POST_MESSAGES/SIGNAL_EVENTS privileges When Hyper-V SynIC is enabled, we may need to allow Windows guests to make hypercalls (POST_MESSAGES/SIGNAL_EVENTS). No issue is currently observed because KVM is very permissive, allowing these hypercalls regardless of guest visible CPUID bits. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-9-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	b26f68c36b	i386: HV_HYPERCALL_AVAILABLE privilege bit is always needed According to TLFS, Hyper-V guest is supposed to check HV_HYPERCALL_AVAILABLE privilege bit before accessing HV_X64_MSR_GUEST_OS_ID/HV_X64_MSR_HYPERCALL MSRs but at least some Windows versions ignore that. As KVM is very permissive and allows accessing these MSRs unconditionally, no issue is observed. We may, however, want to tighten the checks eventually. Conforming to the spec is probably also a good idea. Enable HV_HYPERCALL_AVAILABLE bit unconditionally. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-8-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	5ce48fa354	i386: kill off hv_cpuid_check_and_set() hv_cpuid_check_and_set() does too much: - Checks if the feature is supported by KVM; - Checks if all dependencies are enabled; - Sets the feature bit in cpu->hyperv_features for 'passthrough' mode. To reduce the complexity, move all the logic except for dependencies check out of it. Also, in 'passthrough' mode we don't really need to check dependencies because KVM is supposed to provide a consistent set anyway. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-7-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	071ce4b03b	i386: expand Hyper-V features during CPU feature expansion time To make Hyper-V features appear in e.g. QMP query-cpu-model-expansion we need to expand and set the corresponding CPUID leaves early. Modify x86_cpu_get_supported_feature_word() to call newly introduced Hyper-V specific kvm_hv_get_supported_cpuid() instead of kvm_arch_get_supported_cpuid(). We can't use kvm_arch_get_supported_cpuid() as Hyper-V specific CPUID leaves intersect with KVM's. Note, early expansion will only happen when KVM supports system wide KVM_GET_SUPPORTED_HV_CPUID ioctl (KVM_CAP_SYS_HYPERV_CPUID). Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-6-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	d7652b772f	i386: make hyperv_expand_features() return bool Return 'false' when hyperv_expand_features() sets an error. No functional change intended. Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Message-Id: <20210608120817.1325125-5-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
Vitaly Kuznetsov	07454e2ea8	i386: hardcode supported eVMCS version to '1' Currently, the only eVMCS version, supported by KVM (and described in TLFS) is '1'. When Enlightened VMCS feature is enabled, QEMU takes the supported eVMCS version range (from KVM_CAP_HYPERV_ENLIGHTENED_VMCS enablement) and puts it to guest visible CPUIDs. When (and if) eVMCS ver.2 appears a problem on migration is expected: it doesn't seem to be possible to migrate from a host supporting eVMCS ver.2 to a host, which only support eVMCS ver.1. Hardcode eVMCS ver.1 as the result of 'hv-evmcs' enablement for now. Newer eVMCS versions will have to have their own enablement options (e.g. 'hv-evmcs=2'). Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210608120817.1325125-4-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-07-13 09:13:29 -04:00
David Edmondson	fea4500841	target/i386: Populate x86_ext_save_areas offsets using cpuid where possible Rather than relying on the X86XSaveArea structure definition, determine the offset of XSAVE state areas using CPUID leaf 0xd where possible (KVM and HVF). Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210705104632.2902400-8-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-06 08:33:48 +02:00
David Edmondson	c0198c5f87	target/i386: Pass buffer and length to XSAVE helper In preparation for removing assumptions about XSAVE area offsets, pass a buffer pointer and buffer length to the XSAVE helper functions. Signed-off-by: David Edmondson <david.edmondson@oracle.com> Message-Id: <20210705104632.2902400-5-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-06 07:54:53 +02:00
David Edmondson	436463b84b	target/i386: Consolidate the X86XSaveArea offset checks Rather than having similar but different checks in cpu.h and kvm.c, move them all to cpu.h. Message-Id: <20210705104632.2902400-3-david.edmondson@oracle.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-07-06 07:54:53 +02:00
Paolo Bonzini	9ce8af4d92	target/i386: kvm: add support for TSC scaling Linux 5.14 will add support for nested TSC scaling. Add the corresponding feature in QEMU; to keep support for existing kernels, do not add it to any processor yet. The handling of the VMCS enumeration MSR is ugly; once we have more than one case, we may want to add a table to check VMX features against. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-06-25 10:53:46 +02:00
Chenyi Qiang	035d1ef265	i386: Add ratelimit for bus locks acquired in guest A bus lock is acquired through either split locked access to writeback (WB) memory or any locked access to non-WB memory. It is typically >1000 cycles slower than an atomic operation within a cache and can also disrupt performance on other cores. Virtual Machines can exploit bus locks to degrade the performance of the system. To address this kind of performance DoS attack coming from the VMs, bus lock VM exit is introduced in KVM and it can report the bus locks detected in guest. If enabled in KVM, it would exit to the userspace to let the user enforce throttling policies once bus locks are acquired in VMs. The availability of bus lock VM exit can be detected through the KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in bitmap is the only supported strategy at present. It indicates that KVM will exit to userspace to handle the bus locks. This patch adds a ratelimit on the bus locks acquired in guest as a mitigation policy. Introduce a new field "bus_lock_ratelimit" to record the limited speed of bus locks in the target VM. The user can specify it through the "bus-lock-ratelimit" as a machine property. In current implementation, the default value of the speed is 0 per second, which means no restrictions on the bus locks. As for ratelimit on detected bus locks, simply set the ratelimit interval to 1s and restrict the quota of bus lock occurrence to the value of "bus_lock_ratelimit". A potential alternative is to introduce the time slice as a property which can help the user achieve more precise control. The detail of bus lock VM exit can be found in spec: https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com> Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-06-17 14:11:06 -04:00
Vitaly Kuznetsov	5aa9ef5e4b	i386: use global kvm_state in hyperv_enabled() check There is no need to use vCPU-specific kvm state in hyperv_enabled() check and we need to do that when feature expansion happens early, before vCPU specific KVM state is created. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> Message-Id: <20210422161130.652779-15-vkuznets@redhat.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2021-05-31 15:53:03 -04:00
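
The register layout quoted in commit af7228b88d ("i386: Make Hyper-V version id configurable") for CPUID 0x40000002 can be sketched in C as below. This is only an illustration of the bit packing described in that commit message, not QEMU's actual code: the `HvVersionLeaf` type and the `hv_pack_version()` helper are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the four registers of CPUID leaf 0x40000002. */
typedef struct {
    uint32_t eax; /* Build Number */
    uint32_t ebx; /* Bits 31-16: Major Version, bits 15-0: Minor Version */
    uint32_t ecx; /* Service Pack */
    uint32_t edx; /* Bits 31-24: Service Branch, bits 23-0: Service Number */
} HvVersionLeaf;

/*
 * Hypothetical helper: pack the version fields using the layout quoted
 * in the commit message. The Service Number must fit in 24 bits.
 */
static HvVersionLeaf hv_pack_version(uint32_t build,
                                     uint16_t major, uint16_t minor,
                                     uint32_t service_pack,
                                     uint8_t service_branch,
                                     uint32_t service_number)
{
    HvVersionLeaf leaf;

    assert(service_number <= 0x00FFFFFFu);

    leaf.eax = build;
    leaf.ebx = ((uint32_t)major << 16) | minor;
    leaf.ecx = service_pack;
    leaf.edx = ((uint32_t)service_branch << 24) | service_number;
    return leaf;
}
```

For example, `hv_pack_version(1, 6, 1, 2, 0xAB, 0x123456)` yields EBX = 0x00060001 and EDX = 0xAB123456; the input values are arbitrary test data, not a real Hyper-V version.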


76 Commits