Commit Graph

676 Commits

Author SHA1 Message Date
Sunil Muthuswamy b902710f78 WHPX: refactor load library
This refactors the load library of WHV libraries to make it more
modular. It makes a helper routine that can be called on demand.
This allows future expansion of load library/functions to support
functionality that is dependent on some feature being available.

Signed-off-by: Sunil Muthuswamy <sunilmut@microsoft.com>
Message-Id: <MW2PR2101MB1116578040BE1F0C1B662318C0760@MW2PR2101MB1116.namprd21.prod.outlook.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-18 02:34:10 +01:00
Paolo Bonzini 89a289c7e9 x86: move more x86-generic functions out of PC files
These are needed by microvm too, so move them outside of PC-specific files.
With this patch, microvm.c need not include pc.h anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:33:50 +01:00
Paolo Bonzini ed9e923c3c x86: move SMM property to X86MachineState
Add it to microvm as well, it is a generic property of the x86
architecture.

Suggested-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:33:50 +01:00
Paolo Bonzini 852c27e2ba hw: replace hw/i386/pc.h with a header just for the i8259
Remove the need to include i386/pc.h to get to the i8259 functions.
This is enough to remove the inclusion of hw/i386/pc.h from all non-x86
files.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:33:49 +01:00
Paolo Bonzini 4376c40ded kvm: introduce kvm_kernel_irqchip_* functions
The KVMState struct is opaque, so provide accessors for the fields
that will be moved from current_machine to the accelerator.  For now
they just forward to the machine object, but this will change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:45 +01:00
Paolo Bonzini 23b0898e44 kvm: convert "-machine kvm_shadow_mem" to an accelerator property
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:27 +01:00
Paolo Bonzini 3c75e12ea6 qom: add object_new_with_class
Similar to CPU and machine classes, "-accel" class names are mangled,
so we have to first get a class via accel_find and then instantiate it.
Provide a new function to instantiate a class without going through
object_class_get_name, and use it for CPUs and machines already.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:26 +01:00
Eduardo Habkost 88703ce2e6 i386: Use g_autofree in a few places
Get rid of 12 explicit g_free() calls.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191025025632.5928-1-ehabkost@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-13 16:32:19 -03:00
Cathy Zhang 22a866b616 i386: Add new CPU model Cooperlake
Cooper Lake is intel's successor to Cascade Lake, the new
CPU model inherits features from Cascadelake-Server, while
add one platform associated new feature: AVX512_BF16. Meanwhile,
add STIBP for speculative execution.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-4-git-send-email-cathy.zhang@intel.com>
Reviewed-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-13 16:32:19 -03:00
Cathy Zhang 5af514d0cb i386: Add macro for stibp
stibp feature is already added through the following commit.
0e89165829

Add a macro for it to allow CPU models to report it when host supports.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-3-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-13 16:32:19 -03:00
Cathy Zhang 77b168d221 i386: Add MSR feature bit for MDS-NO
Define MSR_ARCH_CAP_MDS_NO in the IA32_ARCH_CAPABILITIES MSR to allow
CPU models to report the feature when host supports it.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-13 16:32:19 -03:00
Yang Zhong 2605188240 target/i386: disable VMX features if nested=0
If kvm does not support VMX feature by nested=0, the kvm_vmx_basic
can't get the right value from MSR_IA32_VMX_BASIC register, which
make qemu coredump when qemu do KVM_SET_MSRS.

The coredump info:
error: failed to set MSR 0x480 to 0x0
kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.

Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <20191206071111.12128-1-yang.zhong@intel.com>
Reported-by: Catherine Ho <catherine.hecx@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-06 12:35:40 +01:00
Cameron Esfahani 64bef038e7 hvf: correctly inject VMCS_INTR_T_HWINTR versus VMCS_INTR_T_SWINTR.
Previous implementation in hvf_inject_interrupts() would always inject
VMCS_INTR_T_SWINTR even when VMCS_INTR_T_HWINTR was required.  Now
correctly determine when VMCS_INTR_T_HWINTR is appropriate versus
VMCS_INTR_T_SWINTR.

Make sure to clear ins_len and has_error_code when ins_len isn't
valid and error_code isn't set.

Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <bf8d945ea1b423786d7802bbcf769517d1fd01f8.1575330463.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-03 09:11:42 +01:00
Cameron Esfahani e37aa8b0e4 hvf: more accurately match SDM when setting CR0 and PDPTE registers
More accurately match SDM when setting CR0 and PDPTE registers.

Clear PDPTE registers when resetting vcpus.

Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <464adb39c8699fb8331d8ad6016fc3e2eff53dbc.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26 09:58:37 +01:00
Cameron Esfahani 8c3b0e9e67 hvf: correctly handle REX prefix in relation to legacy prefixes
In real x86 processors, the REX prefix must come after legacy prefixes.
REX before legacy is ignored.  Update the HVF emulation code to properly
handle this.  Fix some spelling errors in constants.  Fix some decoder
table initialization issues found by Coverity.

Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <eff30ded8307471936bec5d84c3b6efbc95e3211.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26 09:58:36 +01:00
Cameron Esfahani 9fedbbeeee hvf: remove TSC synchronization code because it isn't fully complete
The existing code in QEMU's HVF support to attempt to synchronize TSC
across multiple cores is not sufficient.  TSC value on other cores
can go backwards.  Until implementation is fixed, remove calls to
hv_vm_sync_tsc().  Pass through TSC to guest OS.

Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <44c4afd2301b8bf99682b229b0796d84edd6d66f.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26 09:58:35 +01:00
Cameron Esfahani fbafbb6db7 hvf: non-RAM, non-ROMD memory ranges are now correctly mapped in
If an area is non-RAM and non-ROMD, then remove mappings so accesses
will trap and can be emulated.  Change hvf_find_overlap_slot() to take
a size instead of an end address: it wouldn't return a slot because
callers would pass the same address for start and end.  Don't always
map area as read/write/execute, respect area flags.

Signed-off-by: Cameron Esfahani <dirty@apple.com>
Message-Id: <1d8476c8f86959273fbdf23c86f8b4b611f5e2e1.1574625592.git.dirty@apple.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26 09:58:33 +01:00
Paolo Bonzini c6f3215ffa target/i386: add two missing VMX features for Skylake and CascadeLake Server
They are present in client (Core) Skylake but pasted wrong into the server
SKUs.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26 09:55:12 +01:00
Eduardo Habkost 02fa60d101 i386: Add -noTSX aliases for hle=off, rtm=off CPU models
We have been trying to avoid adding new aliases for CPU model
versions, but in the case of changes in defaults introduced by
the TAA mitigation patches, the aliases might help avoid user
confusion when applying host software updates.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:35:05 +01:00
Eduardo Habkost 9ab2237f19 i386: Add new versions of Skylake/Cascadelake/Icelake without TSX
One of the mitigation methods for TAA[1] is to disable TSX
support on the host system.  Linux added a mechanism to disable
TSX globally through the kernel command line, and many Linux
distributions now default to tsx=off.  This makes existing CPU
models that have HLE and RTM enabled not usable anymore.

Add new versions of all CPU models that have the HLE and RTM
features enabled, that can be used when TSX is disabled in the
host system.

References:

[1] TAA, TSX asynchronous Abort:
    https://software.intel.com/security-software-guidance/insights/deep-dive-intel-transactional-synchronization-extensions-intel-tsx-asynchronous-abort
    https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:35:05 +01:00
Paolo Bonzini 2a9758c51e target/i386: add support for MSR_IA32_TSX_CTRL
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the
Trusty Side-channel Extension).  By virtualizing the MSR, KVM guests
can disable TSX and avoid paying the price of mitigating TSX-based
attacks on microarchitectural side channels.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:35:05 +01:00
Paolo Bonzini 0723cc8a55 target/i386: add VMX features to named CPU models
This allows using "-cpu Haswell,+vmx", which we did not really want to
support in QEMU but was produced by Libvirt when using the "host-model"
CPU model.  Without this patch, no VMX feature is _actually_ supported
(only the basic instruction set extensions are) and KVM fails to load
in the guest.

This was produced from the output of scripts/kvm/vmxcap using the following
very ugly Python script:

    bits = {
            'INS/OUTS instruction information': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_INS_OUTS'],
            'IA32_VMX_TRUE_*_CTLS support': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_TRUE_CTLS'],
            'External interrupt exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_EXT_INTR_MASK'],
            'NMI exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_NMI_EXITING'],
            'Virtual NMIs': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VIRTUAL_NMIS'],
            'Activate VMX-preemption timer': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VMX_PREEMPTION_TIMER'],
            'Process posted interrupts': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_POSTED_INTR'],
            'Interrupt window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_INTR_PENDING'],
            'Use TSC offsetting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_TSC_OFFSETING'],
            'HLT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_HLT_EXITING'],
            'INVLPG exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_INVLPG_EXITING'],
            'MWAIT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MWAIT_EXITING'],
            'RDPMC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDPMC_EXITING'],
            'RDTSC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDTSC_EXITING'],
            'CR3-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_LOAD_EXITING'],
            'CR3-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_STORE_EXITING'],
            'CR8-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_LOAD_EXITING'],
            'CR8-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_STORE_EXITING'],
            'Use TPR shadow': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_TPR_SHADOW'],
            'NMI-window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_NMI_PENDING'],
            'MOV-DR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MOV_DR_EXITING'],
            'Unconditional I/O exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_UNCOND_IO_EXITING'],
            'Use I/O bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_IO_BITMAPS'],
            'Monitor trap flag': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_TRAP_FLAG'],
            'Use MSR bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_MSR_BITMAPS'],
            'MONITOR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_EXITING'],
            'PAUSE exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_PAUSE_EXITING'],
            'Activate secondary control': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS'],
            'Virtualize APIC accesses': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES'],
            'Enable EPT': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_EPT'],
            'Descriptor-table exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_DESC'],
            'Enable RDTSCP': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDTSCP'],
            'Virtualize x2APIC mode': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE'],
            'Enable VPID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VPID'],
            'WBINVD exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_WBINVD_EXITING'],
            'Unrestricted guest': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST'],
            'APIC register emulation': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT'],
            'Virtual interrupt delivery': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY'],
            'PAUSE-loop exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_PAUSE_LOOP_EXITING'],
            'RDRAND exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDRAND_EXITING'],
            'Enable INVPCID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_INVPCID'],
            'Enable VM functions': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VMFUNC'],
            'VMCS shadowing': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_SHADOW_VMCS'],
            'RDSEED exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDSEED_EXITING'],
            'Enable PML': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_PML'],
            'Enable XSAVES/XRSTORS': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_XSAVES'],
            'Save debug controls': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_DEBUG_CONTROLS'],
            'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL'],
            'Acknowledge interrupt on exit': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_ACK_INTR_ON_EXIT'],
            'Save IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_PAT'],
            'Load IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PAT'],
            'Save IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_EFER'],
            'Load IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_EFER'],
            'Save VMX-preemption timer value': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER'],
            'Clear IA32_BNDCFGS': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_CLEAR_BNDCFGS'],
            'Load debug controls': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS'],
            'IA-32e mode guest': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_IA32E_MODE'],
            'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL'],
            'Load IA32_PAT': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PAT'],
            'Load IA32_EFER': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_EFER'],
            'Load IA32_BNDCFGS': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_BNDCFGS'],
            'Store EFER.LMA into IA-32e mode guest control': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_STORE_LMA'],
            'HLT activity state': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ACTIVITY_HLT'],
            'VMWRITE to VM-exit information fields': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_VMWRITE_VMEXIT'],
            'Inject event with insn length=0': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ZERO_LEN_INJECT'],
            'Execute-only EPT translations': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_EXECONLY'],
            'Page-walk length 4': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_PAGE_WALK_LENGTH_4'],
            'Paging-structure memory type WB': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_WB'],
            '2MB EPT pages': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB'],
            'INVEPT supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT'],
            'EPT accessed and dirty flags': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_AD_BITS'],
            'Single-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT'],
            'All-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_ALL_CONTEXT'],
            'INVVPID supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID'],
            'Individual-address INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_ADDR'],
            'Single-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT'],
            'All-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_ALL_CONTEXT'],
            'Single-context-retaining-globals INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS'],
            'EPTP Switching': ['FEAT_VMX_VMFUNC', 'MSR_VMX_VMFUNC_EPT_SWITCHING']
    }

    import sys
    import textwrap

    out = {}
    for l in sys.stdin.readlines():
        l = l.rstrip()
        if l.endswith('!!'):
            l = l[:-2].rstrip()
        if l.startswith('    ') and (l.endswith('default') or l.endswith('yes')):
            l = l[4:]
            for key, value in bits.items():
                if l.startswith(key):
                    ctl, bit = value
                    if ctl in out:
                        out[ctl] = out[ctl] + ' | '
                    else:
                        out[ctl] = '    [%s] = ' % ctl
                    out[ctl] = out[ctl] + bit

    for x in sorted(out.keys()):
        print("\n         ".join(textwrap.wrap(out[x] + ",")))

Note that the script has a bug in that some keys apply to both VM entry
and VM exit controls ("load IA32_PERF_GLOBAL_CTRL", "load IA32_EFER",
"load IA32_PAT".  Those have to be fixed by hand.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:33:53 +01:00
Liam Merwick 2f34ebf222 hw/i386: Move save_tsc_khz from PCMachineClass to X86MachineClass
Attempting to migrate a VM using the microvm machine class results in the source
QEMU aborting with the following message/backtrace:

target/i386/machine.c:955:tsc_khz_needed: Object 0x555556608fa0 is not an
instance of type generic-pc-machine

abort()
object_class_dynamic_cast_assert()
vmstate_save_state_v()
vmstate_save_state()
vmstate_save()
qemu_savevm_state_complete_precopy()
migration_thread()
migration_thread()
migration_thread()
qemu_thread_start()
start_thread()
clone()

The access to the machine class returned by MACHINE_GET_CLASS() in
tsc_khz_needed() is crashing as it is trying to dereference a different
type of machine class object (TYPE_PC_MACHINE) to that of this microVM.

This can be resolved by extending the changes in the following commit
f0bb276bf8 ("hw/i386: split PCMachineState deriving X86MachineState from it")
and moving the save_tsc_khz field in PCMachineClass to X86MachineClass.

Fixes: f0bb276bf8 ("hw/i386: split PCMachineState deriving X86MachineState from it")
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-Id: <1574075605-25215-1-git-send-email-liam.merwick@oracle.com>
Reviewed-by: Sergio Lopez <slp@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19 10:01:34 +01:00
Pawan Gupta 7fac38635e target/i386: Export TAA_NO bit to guests
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).

Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19 10:01:32 +01:00
Paolo Bonzini 7f7a585d5b target/i386: add PSCHANGE_NO bit for the ARCH_CAPABILITIES MSR
This is required to disable ITLB multihit mitigations in nested
hypervisors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19 10:00:36 +01:00
Emilio G. Cota aaf4b62e19 target/i386: fetch code with translator_ld
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2019-10-28 15:12:38 +00:00
Paolo Bonzini bf13bfab08 i386: implement IGNNE
Change the handling of port F0h writes and FPU exceptions to implement IGNNE.

The implementation mixes a bit what the chipset and processor do in real
hardware, but the effect is the same as what happens with actual FERR#
and IGNNE# pins: writing to port F0h asserts IGNNE# in addition to lowering
FP_IRQ; while clearing the SE bit in the FPU status word deasserts IGNNE#.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:07 +02:00
Paolo Bonzini 5caa1833d2 target/i386: introduce cpu_set_fpus
In the next patch, this will provide a hook to detect clearing of
FSW.ES.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:07 +02:00
Paolo Bonzini 6f529b7534 target/i386: move FERR handling to target/i386
Move it out of pc.c since it is strictly tied to TCG.  This is
almost exclusively code movement, the next patch will implement
IGNNE.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:07 +02:00
Paolo Bonzini 673652a785 Merge commit 'df84f17' into HEAD
This merge fixes a semantic conflict with the trivial tree.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:02 +02:00
Tao Xu 8b44d8609f target/i386: Introduce Denverton CPU model
Denverton is the Atom Processor of Intel Harrisonville platform.

For more information:
https://ark.intel.com/content/www/us/en/ark/products/\
codename/63508/denverton.html

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190718073405.28301-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-23 23:37:42 -03:00
Tao Xu 6508799707 target/i386: Add support for save/load IA32_UMWAIT_CONTROL MSR
UMWAIT and TPAUSE instructions use 32bits IA32_UMWAIT_CONTROL at MSR
index E1H to determines the maximum time in TSC-quanta that the processor
can reside in either C0.1 or C0.2.

This patch is to Add support for save/load IA32_UMWAIT_CONTROL MSR in
guest.

Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-3-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-23 17:50:27 +02:00
Tao Xu 67192a298f x86/cpu: Add support for UMONITOR/UMWAIT/TPAUSE
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
This patch adds support for user wait instructions in KVM. Availability
of the user wait instructions is indicated by the presence of the CPUID
feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to
set the maximum time.

The patch enable the umonitor, umwait and tpause features in KVM.
Because umwait and tpause can put a (psysical) CPU into a power saving
state, by default we dont't expose it to kvm and enable it only when
guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on"
(enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE
instructions. If the instruction causes a delay, the amount of time
delayed is called here the physical delay. The physical delay is first
computed by determining the virtual delay (the time to delay relative to
the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause
an invalid-opcode exception(#UD).

The release document ref below link:
https://software.intel.com/sites/default/files/\
managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-2-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-23 17:50:27 +02:00
Vitaly Kuznetsov 30d6ff662d i386/kvm: add NoNonArchitecturalCoreSharing Hyper-V enlightenment
Hyper-V TLFS specifies this enlightenment as:
"NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never
share a physical core with another virtual processor, except for virtual
processors that are reported as sibling SMT threads. This can be used as an
optimization to avoid the performance overhead of STIBP".

However, STIBP is not the only implication. It was found that Hyper-V on
KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see
NoNonArchitecturalCoreSharing bit.

KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to
indicate that SMT on the host is impossible (not supported of forcefully
disabled).

Implement NoNonArchitecturalCoreSharing support in QEMU as tristate:
'off' - the feature is disabled (default)
'on' - the feature is enabled. This is only safe if vCPUS are properly
 pinned and correct topology is exposed. As CPU pinning is done outside
 of QEMU the enablement decision will be made on a higher level.
'auto' - copy KVM setting. As during live migration SMT settings on the
source and destination host may differ this requires us to add a migration
blocker.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20191018163908.10246-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 09:38:42 +02:00
Mario Smarduch 73284563dc target/i386: log MCE guest and host addresses
Patch logs MCE AO, AR messages injected to guest or taken by QEMU itself.
We print the QEMU address for guest MCEs, helps on hypervisors that have
another source of MCE logging like mce log, and when they go missing.

For example we found these QEMU logs:

September 26th 2019, 17:36:02.309        Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected   qemu-system-x86_64      amsN    ams3nodeNNNN

September 27th 2019, 06:25:03.234        Droplet-153258224: Guest MCE Memory Error at qemu addr 0x7f8ce14f5000 and guest 3d6f5000 addr of type BUS_MCEERR_AR injected   qemu-system-x86_64      amsN    ams3nodeNNNN

The first log had a corresponding mce log entry, the second didnt (logging
thresholds) we can infer from second entry same PA and mce type.

Signed-off-by: Mario Smarduch <msmarduch@digitalocean.com>
Message-Id: <20191009164459.8209-3-msmarduch@digitalocean.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 09:38:38 +02:00
Xiaoyao Li 69edb0f37a target/i386: Add Snowridge-v2 (no MPX) CPU model
Add new version of Snowridge CPU model that removes MPX feature.

MPX support is being phased out by Intel. GCC has dropped it, Linux kernel
and KVM are also going to do that in the future.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Eduardo Habkost af95cafb87 i386: Omit all-zeroes entries from KVM CPUID table
KVM has a 80-entry limit at KVM_SET_CPUID2.  With the
introduction of CPUID[0x1F], it is now possible to hit this limit
with unusual CPU configurations, e.g.:

  $ ./x86_64-softmmu/qemu-system-x86_64 \
    -smp 1,dies=2,maxcpus=2 \
    -cpu EPYC,check=off,enforce=off \
    -machine accel=kvm
  qemu-system-x86_64: kvm_init_vcpu failed: Argument list too long

This happens because QEMU adds a lot of all-zeroes CPUID entries
for unused CPUID leaves.  In the example above, we end up
creating 48 all-zeroes CPUID entries.

KVM already returns all-zeroes when emulating the CPUID
instruction if an entry is missing, so the all-zeroes entries are
redundant.  Skip those entries.  This reduces the CPUID table
size by half while keeping CPUID output unchanged.

Reported-by: Yumei Huang <yuhuang@redhat.com>
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1741508
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190822225210.32541-1-ehabkost@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Bingsong Si 76ecd7a514 i386: Fix legacy guest with xsave panic on host kvm without update cpuid.
without kvm commit 412a3c41, CPUID(EAX=0xd,ECX=0).EBX always equal to 0 even
through guest update xcr0, this will crash legacy guest(e.g., CentOS 6).
Below is the call trace on the guest.

[    0.000000] kernel BUG at mm/bootmem.c:469!
[    0.000000] invalid opcode: 0000 [#1] SMP
[    0.000000] last sysfs file:
[    0.000000] CPU 0
[    0.000000] Modules linked in:
[    0.000000]
[    0.000000] Pid: 0, comm: swapper Tainted: G           --------------- H  2.6.32-279#2 Red Hat KVM
[    0.000000] RIP: 0010:[<ffffffff81c4edc4>]  [<ffffffff81c4edc4>] alloc_bootmem_core+0x7b/0x29e
[    0.000000] RSP: 0018:ffffffff81a01cd8  EFLAGS: 00010046
[    0.000000] RAX: ffffffff81cb1748 RBX: ffffffff81cb1720 RCX: 0000000001000000
[    0.000000] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffffffff81cb1720
[    0.000000] RBP: ffffffff81a01d38 R08: 0000000000000000 R09: 0000000000001000
[    0.000000] R10: 02008921da802087 R11: 00000000ffff8800 R12: 0000000000000000
[    0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000001000000
[    0.000000] FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0018 ES: 0018 CR0: 0000000080050033
[    0.000000] CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000001406b0
[    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.000000] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
[    0.000000] Stack:
[    0.000000]  0000000000000002 81a01dd881eaf060 000000007e5fe227 0000000000001001
[    0.000000] <d> 0000000000000040 0000000000000001 0000006cffffffff 0000000001000000
[    0.000000] <d> ffffffff81cb1720 0000000000000000 0000000000000000 0000000000000000
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff81c4f074>] ___alloc_bootmem_nopanic+0x8d/0xca
[    0.000000]  [<ffffffff81c4f0cf>] ___alloc_bootmem+0x11/0x39
[    0.000000]  [<ffffffff81c4f172>] __alloc_bootmem+0xb/0xd
[    0.000000]  [<ffffffff814d42d9>] xsave_cntxt_init+0x249/0x2c0
[    0.000000]  [<ffffffff814e0689>] init_thread_xstate+0x17/0x25
[    0.000000]  [<ffffffff814e0710>] fpu_init+0x79/0xaa
[    0.000000]  [<ffffffff814e27e3>] cpu_init+0x301/0x344
[    0.000000]  [<ffffffff81276395>] ? sort+0x155/0x230
[    0.000000]  [<ffffffff81c30cf2>] trap_init+0x24e/0x25f
[    0.000000]  [<ffffffff81c2bd73>] start_kernel+0x21c/0x430
[    0.000000]  [<ffffffff81c2b33a>] x86_64_start_reservations+0x125/0x129
[    0.000000]  [<ffffffff81c2b438>] x86_64_start_kernel+0xfa/0x109
[    0.000000] Code: 03 48 89 f1 49 c1 e8 0c 48 0f af d0 48 c7 c6 00 a6 61 81 48 c7 c7 00 e5 79 81 31 c0 4c 89 74 24 08 e8 f2 d7 89 ff 4d 85 e4 75 04 <0f> 0b eb fe 48 8b 45 c0 48 83 e8 01 48 85 45
c0 74 04 0f 0b eb

Signed-off-by: Bingsong Si <owen.si@ucloud.cn>
Message-Id: <20190822042901.16858-1-owen.si@ucloud.cn>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Tao Xu e7694a5eae target/i386: drop the duplicated definition of cpuid AVX512_VBMI macro
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.

Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-3-tao3.xu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Tao Xu f2be0bebb6 target/i386: clean up comments over 80 chars per line
Add some comments, clean up comments over 80 chars per line. And there
is an extra line in comment of CPUID_8000_0008_EBX_WBNOINVD, remove
the extra enter and spaces.

Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-2-tao3.xu@intel.com>
[ehabkost: rebase to latest git master]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:33 -03:00
Thomas Huth a1834d975f target/i386/kvm: Silence warning from Valgrind about uninitialized bytes
When I run QEMU with KVM under Valgrind, I currently get this warning:

 Syscall param ioctl(generic) points to uninitialised byte(s)
    at 0x95BA45B: ioctl (in /usr/lib64/libc-2.28.so)
    by 0x429DC3: kvm_ioctl (kvm-all.c:2365)
    by 0x51B249: kvm_arch_get_supported_msr_feature (kvm.c:469)
    by 0x4C2A49: x86_cpu_get_supported_feature_word (cpu.c:3765)
    by 0x4C4116: x86_cpu_expand_features (cpu.c:5065)
    by 0x4C7F8D: x86_cpu_realizefn (cpu.c:5242)
    by 0x5961F3: device_set_realized (qdev.c:835)
    by 0x7038F6: property_set_bool (object.c:2080)
    by 0x707EFE: object_property_set_qobject (qom-qobject.c:26)
    by 0x705814: object_property_set_bool (object.c:1338)
    by 0x498435: pc_new_cpu (pc.c:1549)
    by 0x49C67D: pc_cpus_init (pc.c:1681)
  Address 0x1ffeffee74 is on thread 1's stack
  in frame #2, created by kvm_arch_get_supported_msr_feature (kvm.c:445)

It's harmless, but a little bit annoying, so silence it by properly
initializing the whole structure with zeroes.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:20 +02:00
Paolo Bonzini 048c95163b target/i386: work around KVM_GET_MSRS bug for secondary execution controls
Some secondary controls are automatically enabled/disabled based on the CPUID
values that are set for the guest.  However, they are still available at a
global level and therefore should be present when KVM_GET_MSRS is sent to
/dev/kvm.

Unfortunately KVM forgot to include those, so fix that.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:20 +02:00
Paolo Bonzini 20a78b02d3 target/i386: add VMX features
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish.  The same
infrastructure enables support for limiting VMX features in named
CPU models.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:20 +02:00
Paolo Bonzini 704798add8 target/i386: add VMX definitions
These will be used to compile the list of VMX features for named
CPU models, and/or by the code that sets up the VMX MSRs.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Paolo Bonzini ede146c2e7 target/i386: expand feature words to 64 bits
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs.  (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Paolo Bonzini 99e24dbdaa target/i386: introduce generic feature dependency mechanism
Sometimes a CPU feature does not make sense unless another is
present.  In the case of VMX features, KVM does not even allow
setting the VMX controls to some invalid combinations.

Therefore, this patch adds a generic mechanism that looks for bits
that the user explicitly cleared, and uses them to remove other bits
from the expanded CPU definition.  If these dependent bits were also
explicitly *set* by the user, this will be a warning for "-cpu check"
and an error for "-cpu enforce".  If not, then the dependent bits are
cleared silently, for convenience.

With VMX features, this will be used so that for example
"-cpu host,-rdrand" will also hide support for RDRAND exiting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Paolo Bonzini 245edd0cfb target/i386: handle filtered_features in a new function mark_unavailable_features
The next patch will add a different reason for filtering features, unrelated
to host feature support.  Extract a new function that takes care of disabling
the features and optionally reporting them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Dmitry Poletaev 56f997500a Fix wrong behavior of cpu_memory_rw_debug() function in SMM
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.

Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:18 +02:00
Sebastian Andrzej Siewior e900135dcf i386: Add CPUID bit for CLZERO and XSAVEERPTR
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:17 +02:00
Philippe Mathieu-Daudé 754f287176 target/i386: Fix broken build with WHPX enabled
The WHPX build is broken since commit 12e9493df9 which removed the
"hw/boards.h" where MachineState is declared:

  $ ./configure \
     --enable-hax --enable-whpx

  $ make x86_64-softmmu/all
  [...]
    CC      x86_64-softmmu/target/i386/whpx-all.o
  target/i386/whpx-all.c: In function 'whpx_accel_init':
  target/i386/whpx-all.c:1378:25: error: dereferencing pointer to
  incomplete type 'MachineState' {aka 'struct MachineState'}
       whpx->mem_quota = ms->ram_size;
                           ^~
  make[1]: *** [rules.mak:69: target/i386/whpx-all.o] Error 1
    CC      x86_64-softmmu/trace/generated-helpers.o
  make[1]: Target 'all' not remade because of errors.
  make: *** [Makefile:471: x86_64-softmmu/all] Error 2

Restore this header, partially reverting commit 12e9493df9.

Fixes: 12e9493df9
Reported-by: Ilias Maratos <i.maratos@gmail.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20190920113329.16787-2-philmd@redhat.com>
2019-09-26 19:00:53 +01:00
Philippe Mathieu-Daudé d6d059ca07 hw/i386/pc: Extract e820 memory layout code
Suggested-by: Samuel Ortiz <sameo@linux.intel.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190818225414.22590-3-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-16 17:13:07 +02:00
Wanpeng Li d38d201f0e i386/kvm: support guest access CORE cstate
Allow guest reads CORE cstate when exposing host CPU power management capabilities
to the guest. PKG cstate is restricted to avoid a guest to get the whole package
information in multi-tenant scenario.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1563154124-18579-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-09-16 12:32:20 +02:00
Tony Nguyen 14776ab5a1 tcg: TCGMemOp is now accelerator independent MemOp
Preparation for collapsing the two byte swaps, adjust_endianness and
handle_bswap, along the I/O path.

Target dependant attributes are conditionalized upon NEED_CPU_H.

Signed-off-by: Tony Nguyen <tony.nguyen@bt.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <81d9cd7d7f5aaadfa772d6c48ecee834e9cf7882.1566466906.git.tony.nguyen@bt.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2019-09-03 08:30:38 -07:00
Peter Maydell f3b8f18ebf Monitor patches for 2019-08-21
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl1dZKsSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZTJ4QP/10izA+dSofQ9404GRq3TNzwRCKugU44
 nES9CqDh6x5emx+ADQWYkugblgfH9GOvUaAUNtY+uFaEr55yC/F+VWeVXvyjt5U6
 ZpPZqIRDOHo2+PZrddr/KcKmiomS6plz03m9bzb3pYN1yIl2ZzgClAhAqWQLk0WB
 wwiY+YsJ83YR4sdiRMZkuF+UL7N8fSqYvIIj0yzM8+8ONDor9n16PoPeFg3JSsyG
 aMxXIUnSBZAVtClaNkUPtS0Wf9XEuqoG1rvMRV4Vv+eeb7fwA414DqanRJdLlGMA
 yNRtFcVyztCfjgVEXnY9JJlFe6pDkoe8ycoimQ4YA60C9c1DIMHqyjFWXRHfDwk8
 bYMSX6CTpfoEvbTfmwqYR6KSkb/KuXiFDmcYlTYFvIt3grhhdHQbru9vy+E5sm/b
 j3CPV2DTCkeGY+oZFfKIaQT9yoWZOhmMY5doMTYyinXygPTGQROUrHtzUeRXKmJZ
 arqDRmh+mlEiGETNeYQCI45eYCSDYxO+UNrhszxhmv6B1+ixhIrV2oXhi61vVBeY
 yngY4EILbuA2Z/E4BevJk91ESWJTr3UP13c6p7yf21iN4BD1KkHy5HoXCgYfQDeV
 4kar49g6WQ/VQEiwhi65Xd0OwstynkcV69F+kMagVMgaLeRsdU5ikGJQzxTeWJRl
 SPpc7oDwuAS+
 =2F3E
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2019-08-21' into staging

Monitor patches for 2019-08-21

# gpg: Signature made Wed 21 Aug 2019 16:35:07 BST
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-monitor-2019-08-21:
  monitor/qmp: Update comment for commit 4eaca8de26
  qdev: Collect HMP handlers command handlers in qdev-monitor.c
  qapi: Move query-target from misc.json to machine.json
  hw/core: Move cpu.c, cpu.h from qom/ to hw/core/

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-08-22 10:31:21 +01:00
Markus Armbruster 2e5b09fd0e hw/core: Move cpu.c, cpu.h from qom/ to hw/core/
Suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190709152053.16670-2-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[Rebased onto merge commit 95a9457fd44; missed instances of qom/cpu.h
in comments replaced]
2019-08-21 13:24:01 +02:00
Jing Liu 80db491da4 x86: Intel AVX512_BF16 feature enabling
Intel CooperLake cpu adds AVX512_BF16 instruction, defining as
CPUID.(EAX=7,ECX=1):EAX[bit 05].

The patch adds a property for setting the subleaf of CPUID leaf 7 in
case that people would like to specify it.

The release spec link as follows,
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 20:00:52 +02:00
Pavel Dovgalyuk 9e9b10c649 icount: remove unnecessary gen_io_end calls
Prior patch resets can_do_io flag at the TB entry. Therefore there is no
need in resetting this flag at the end of the block.
This patch removes redundant gen_io_end calls.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <156404429499.18669.13404064982854123855.stgit@pasha-Precision-3630-Tower>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@gmail.com>
2019-08-20 17:26:22 +02:00
Peter Maydell 1e8a98b538 target/i386: Return 'indefinite integer value' for invalid SSE fp->int conversions
The x86 architecture requires that all conversions from floating
point to integer which raise the 'invalid' exception (infinities of
both signs, NaN, and all values which don't fit in the destination
integer) return what the x86 spec calls the "indefinite integer
value", which is 0x8000_0000 for 32-bits or 0x8000_0000_0000_0000 for
64-bits.  The softfloat functions return the more usual behaviour of
positive overflows returning the maximum value that fits in the
destination integer format and negative overflows returning the
minimum value that fits.

Wrap the softfloat functions in x86-specific versions which
detect the 'invalid' condition and return the indefinite integer.

Note that we don't use these wrappers for the 3DNow! pf2id and pf2iw
instructions, which do return the minimum value that fits in
an int32 if the input float is a large negative number.

Fixes: https://bugs.launchpad.net/qemu/+bug/1815423
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20190805180332.10185-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:20 +02:00
Andrey Shinkevich 1f670a95b3 i386/kvm: initialize struct at full before ioctl call
Not the whole structure is initialized before passing it to the KVM.
Reduce the number of Valgrind reports.

Signed-off-by: Andrey Shinkevich <andrey.shinkevich@virtuozzo.com>
Message-Id: <1564502498-805893-4-git-send-email-andrey.shinkevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:19 +02:00
Li Qiang de428cead6 target-i386: kvm: 'kvm_get_supported_msrs' cleanup
Function 'kvm_get_supported_msrs' is only called once
now, get rid of the static variable 'kvm_supported_msrs'.

Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20190725151639.21693-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:19 +02:00
Wanpeng Li b896c4b50d target-i386: adds PV_SCHED_YIELD CPUID feature bit
Adds PV_SCHED_YIELD CPUID feature bit.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1562745771-8414-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:18 +02:00
Marcelo Tosatti d645e13287 kvm: i386: halt poll control MSR support
Add support for halt poll control MSR: save/restore, migration
and new feature name.

The purpose of this MSR is to allow the guest to disable
host halt poll.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:17 +02:00
Markus Armbruster 54d31236b9 sysemu: Split sysemu/runstate.h off sysemu/sysemu.h
sysemu/sysemu.h is a rather unfocused dumping ground for stuff related
to the system-emulator.  Evidence:

* It's included widely: in my "build everything" tree, changing
  sysemu/sysemu.h still triggers a recompile of some 1100 out of 6600
  objects (not counting tests and objects that don't depend on
  qemu/osdep.h, down from 5400 due to the previous two commits).

* It pulls in more than a dozen additional headers.

Split stuff related to run state management into its own header
sysemu/runstate.h.

Touching sysemu/sysemu.h now recompiles some 850 objects.  qemu/uuid.h
also drops from 1100 to 850, and qapi/qapi-types-run-state.h from 4400
to 4200.  Touching new sysemu/runstate.h recompiles some 500 objects.

Since I'm touching MAINTAINERS to add sysemu/runstate.h anyway, also
add qemu/main-loop.h.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-30-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[Unbreak OS-X build]
2019-08-16 13:37:36 +02:00
Markus Armbruster d5938f29fe Clean up inclusion of sysemu/sysemu.h
In my "build everything" tree, changing sysemu/sysemu.h triggers a
recompile of some 5400 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

Almost a third of its inclusions are actually superfluous.  Delete
them.  Downgrade two more to qapi/qapi-types-run-state.h, and move one
from char/serial.h to char/serial.c.

hw/semihosting/config.c, monitor/monitor.c, qdev-monitor.c, and
stubs/semihost.c define variables declared in sysemu/sysemu.h without
including it.  The compiler is cool with that, but include it anyway.

This doesn't reduce actual use much, as it's still included into
widely included headers.  The next commit will tackle that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-27-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2019-08-16 13:31:53 +02:00
Markus Armbruster 12e9493df9 Include hw/boards.h a bit less
hw/boards.h pulls in almost 60 headers.  The less we include it into
headers, the better.  As a first step, drop superfluous inclusions,
and downgrade some more to what's actually needed.  Gets rid of just
one inclusion into a header.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-23-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2019-08-16 13:31:53 +02:00
Markus Armbruster db72581598 Include qemu/main-loop.h less
In my "build everything" tree, changing qemu/main-loop.h triggers a
recompile of some 5600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).  It includes block/aio.h,
which in turn includes qemu/event_notifier.h, qemu/notify.h,
qemu/processor.h, qemu/qsp.h, qemu/queue.h, qemu/thread-posix.h,
qemu/thread.h, qemu/timer.h, and a few more.

Include qemu/main-loop.h only where it's needed.  Touching it now
recompiles only some 1700 objects.  For block/aio.h and
qemu/event_notifier.h, these numbers drop from 5600 to 2800.  For the
others, they shrink only slightly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190812052359.30071-21-armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster dc5e9ac716 Include qemu/queue.h slightly less
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-20-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 650d103d3e Include hw/hw.h exactly where needed
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 8a9358cc6e migration: Move the VMStateDescription typedef to typedefs.h
We declare incomplete struct VMStateDescription in a couple of places
so we don't have to include migration/vmstate.h for the typedef.
That's fine with me.  However, the next commit will drop
migration/vmstate.h from a massive number of compiles.  Move the
typedef to qemu/typedefs.h now, so I don't have to insert struct in
front of VMStateDescription all over the place then.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-15-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 71e8a91585 Include sysemu/reset.h a lot less
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

The main culprit is hw/hw.h, which supposedly includes it for
convenience.

Include sysemu/reset.h only where it's needed.  Touching it now
recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 2ae16a6aa4 Include generated QAPI headers less
Some of the generated qapi-types-MODULE.h are included all over the
place.  Changing a QAPI type can trigger massive recompiling.  Top
scorers recompile more than 1000 out of some 6600 objects (not
counting tests and objects that don't depend on qemu/osdep.h):

    6300 qapi/qapi-builtin-types.h
    5700 qapi/qapi-types-run-state.h
    3900 qapi/qapi-types-common.h
    3300 qapi/qapi-types-sockets.h
    3000 qapi/qapi-types-misc.h
    3000 qapi/qapi-types-crypto.h
    3000 qapi/qapi-types-job.h
    3000 qapi/qapi-types-block-core.h
    2800 qapi/qapi-types-block.h
    1300 qapi/qapi-types-net.h

Clean up headers to include generated QAPI headers only where needed.
Impact is negligible except for hw/qdev-properties.h.

This header includes qapi/qapi-types-block.h and
qapi/qapi-types-misc.h.  They are used only in expansions of property
definition macros such as DEFINE_PROP_BLOCKDEV_ON_ERROR() and
DEFINE_PROP_OFF_AUTO().  Moving their inclusion from
hw/qdev-properties.h to the users of these macros avoids pointless
recompiles.  This is how other property definition macros, such as
DEFINE_PROP_NETDEV(), already work.

Improves things for some of the top scorers:

    3600 qapi/qapi-types-common.h
    2800 qapi/qapi-types-sockets.h
     900 qapi/qapi-types-misc.h
    2200 qapi/qapi-types-crypto.h
    2100 qapi/qapi-types-job.h
    2100 qapi/qapi-types-block-core.h
     270 qapi/qapi-types-block.h

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-3-armbru@redhat.com>
2019-08-16 13:31:51 +02:00
Paul Lai ff656fcd33 i386: Fix Snowridge CPU model name and features
Changing the name to Snowridge from SnowRidge-Server.
There is no client model of Snowridge, so "-Server" is unnecessary.

Removing CPUID_EXT_VMX from Snowridge cpu feature list.

Signed-off-by: Paul Lai <paul.c.lai@intel.com>
Tested-by: Tao3 Xu <tao3.xu@intel.com>
Message-Id: <20190716155808.25010-1-paul.c.lai@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-29 13:08:02 -03:00
Jan Kiszka bec7156a45 i386/kvm: Do not sync nested state during runtime
Writing the nested state e.g. after a vmport access can invalidate
important parts of the kernel-internal state, and it is not needed as
well. So leave this out from KVM_PUT_RUNTIME_STATE.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <bdd53f40-4e60-f3ae-7ec6-162198214953@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-24 11:21:59 +02:00
Jiri Slaby d4b976c0a8 target/i386: sev: fix failed message typos
In these multiline messages, there were typos. Fix them -- add a missing
space and remove a superfluous apostrophe.

Inspired by Tom's patch.

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: qemu-trivial@nongnu.org
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <20190719104118.17735-1-jslaby@suse.cz>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-19 23:45:28 +02:00
Denis V. Lunev 2924ab02c2 i386: indicate that 'pconfig' feature was removed intentionally
pconfig feature was added in 5131dc433d and removed in 712f807e19.
This patch mark this feature as known to QEMU and removed by
intentinally. This follows the convention of 9ccb9784b5 and f1a23522b0
dealing with 'osxsave' and 'ospke'.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190719111222.14943-1-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-19 23:45:28 +02:00
Paolo Bonzini 1e44f3ab71 target/i386: skip KVM_GET/SET_NESTED_STATE if VMX disabled, or for SVM
Do not allocate env->nested_state unless we later need to migrate the
nested virtualization state.

With this change, nested_state_needed() will return false if the
VMX flag is not included in the virtual machine.  KVM_GET/SET_NESTED_STATE
is also disabled for SVM which is safer (we know that at least the NPT
root and paging mode have to be saved/loaded), and thus the corresponding
subsection can go away as well.

Inspired by a patch from Liran Alon.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-19 18:02:22 +02:00
Liran Alon 79a197ab18 target/i386: kvm: Demand nested migration kernel capabilities only when vCPU may have enabled VMX
Previous to this change, a vCPU exposed with VMX running on a kernel
without KVM_CAP_NESTED_STATE or KVM_CAP_EXCEPTION_PAYLOAD resulted in
adding a migration blocker. This was because when the code was written
it was thought there is no way to reliably know if a vCPU is utilising
VMX or not at runtime. However, it turns out that this can be known to
some extent:

In order for a vCPU to enter VMX operation it must have CR4.VMXE set.
Since it was set, CR4.VMXE must remain set as long as the vCPU is in
VMX operation. This is because CR4.VMXE is one of the bits set
in MSR_IA32_VMX_CR4_FIXED1.
There is one exception to the above statement when vCPU enters SMM mode.
When a vCPU enters SMM mode, it temporarily exits VMX operation and
may also reset CR4.VMXE during execution in SMM mode.
When the vCPU exits SMM mode, vCPU state is restored to be in VMX operation
and CR4.VMXE is restored to its original state of being set.
Therefore, when the vCPU is not in SMM mode, we can infer whether
VMX is being used by examining CR4.VMXE. Otherwise, we cannot
know for certain but assume the worse that vCPU may utilise VMX.

Summaring all the above, a vCPU may have enabled VMX in case
CR4.VMXE is set or vCPU is in SMM mode.

Therefore, remove migration blocker and check before migration
(cpu_pre_save()) if the vCPU may have enabled VMX. If true, only then
require relevant kernel capabilities.

While at it, demand KVM_CAP_EXCEPTION_PAYLOAD only when the vCPU is in
guest-mode and there is a pending/injected exception. Otherwise, this
kernel capability is not required for proper migration.

Reviewed-by: Joao Martins <joao.m.martins@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Tested-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-19 18:01:47 +02:00
Alex Williamson 56e2ec9488 target/i386: sev: Do not unpin ram device memory region
The commit referenced below skipped pinning ram device memory when
ram blocks are added, we need to do the same when they're removed.

Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Fixes: cedc0ad539 ("target/i386: sev: Do not pin the ram device memory region")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <156320087103.2556.10983987500488190423.stgit@gimli.home>
Reviewed-by: Singh, Brijesh <brijesh.singh@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-15 20:58:37 +02:00
Stefan Weil f2b143a281 Fix broken build with WHPX enabled
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20190712132611.20411-1-sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-15 11:20:43 +02:00
Peter Maydell c4107e8208 Bugfixes.
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJdH7FgAAoJEL/70l94x66DdUUH+gLr/ZdjLIdfYy9cjcnevf4E
 cJlxdaW9KvUsK2uVgqQ/3b1yF+GCGk10n6n8ZTIbClhs+6NpqEMz5O3FA/Na6FGA
 48M2DwJaJ2H9AG/lQBlSBNUZfLsEJ9rWy7DHvNut5XMJFuWGwdtF/jRhUm3KqaRq
 vAaOgcQHbzHU9W8r1NJ7l6pnPebeO7S0JQV+82T/ITTz2gEBDUkJ36boO6fedkVQ
 jLb9nZyG3CJXHm2WlxGO4hkqbLFzURnCi6imOh2rMdD8BCu1eIVl59tD1lC/A0xv
 Pp3xXnv9SgJXsV4/I/N3/nU85ZhGVMPQZXkxaajHPtJJ0rQq7FAG8PJMEj9yPe8=
 =poke
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

Bugfixes.

# gpg: Signature made Fri 05 Jul 2019 21:21:52 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  ioapic: use irq number instead of vector in ioapic_eoi_broadcast
  hw/i386: Fix linker error when ISAPC is disabled
  Makefile: generate header file with the list of devices enabled
  target/i386: kvm: Fix when nested state is needed for migration
  minikconf: do not include variables from MINIKCONF_ARGS in config-all-devices.mak
  target/i386: fix feature check in hyperv-stub.c
  ioapic: clear irq_eoi when updating the ioapic redirect table entry
  intel_iommu: Fix unexpected unmaps during global unmap
  intel_iommu: Fix incorrect "end" for vtd_address_space_unmap
  i386/kvm: Fix build with -m32
  checkpatch: do not warn for multiline parenthesized returned value
  pc: fix possible NULL pointer dereference in pc_machine_get_device_memory_region_size()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-07-08 10:26:18 +01:00
Liran Alon ec7b1bbd2c target/i386: kvm: Fix when nested state is needed for migration
When vCPU is in VMX operation and enters SMM mode,
it temporarily exits VMX operation but KVM maintained nested-state
still stores the VMXON region physical address, i.e. even when the
vCPU is in SMM mode then (nested_state->hdr.vmx.vmxon_pa != -1ull).

Therefore, there is no need to explicitly check for
KVM_STATE_NESTED_SMM_VMXON to determine if it is necessary
to save nested-state as part of migration stream.

Reviewed-by: Karl Heubaum <karl.heubaum@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190624230514.53326-1-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05 22:16:46 +02:00
Alex Bennée 4b03403f76 target/i386: fix feature check in hyperv-stub.c
Commit 2d384d7c8 broken the build when built with:

  configure --without-default-devices --disable-user

The reason was the conversion of cpu->hyperv_synic to
cpu->hyperv_synic_kvm_only although the rest of the patch introduces a
feature checking mechanism. So I've fixed the KVM_EXIT_HYPERV_SYNIC in
hyperv-stub to do the same feature check as in the real hyperv.c

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20190624123835.28869-1-alex.bennee@linaro.org>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05 22:16:46 +02:00
Max Reitz 9dc83cd9c3 i386/kvm: Fix build with -m32
find_next_bit() takes a pointer of type "const unsigned long *", but the
first argument passed here is a "uint64_t *".  These types are
incompatible when compiling qemu with -m32.

Just use ctz64() instead.

Fixes: c686193072
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190624193913.28343-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-05 22:16:46 +02:00
Eduardo Habkost fd63c6d1a5 i386: Add Cascadelake-Server-v2 CPU model
Add new version of Cascadelake-Server CPU model, setting
stepping=5 and enabling the IA32_ARCH_CAPABILITIES MSR
with some flags.

The new feature will introduce a new host software requirement,
breaking our CPU model runnability promises.  This means we can't
enable the new CPU model version by default in QEMU 4.1, because
management software isn't ready yet to resolve CPU model aliases.
This is why "pc-*-4.1" will keep returning Cascadelake-Server-v1
if "-cpu Cascadelake-Server" is specified.

Includes a test case to ensure the right combinations of
machine-type + CPU model + command-line feature flags will work
as expected.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-10-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190703221723.8161-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:12:30 -03:00
Eduardo Habkost 0788a56bd1 i386: Make unversioned CPU models be aliases
This will make unversioned CPU models behavior depend on the
machine type:

* "pc-*-4.0" and older will not report them as aliases.
  This is done to keep compatibility with older QEMU versions
  after management software starts translating aliases.

* "pc-*-4.1" will translate unversioned CPU models to -v1.
  This is done to keep compatibility with existing management
  software, that still relies on CPU model runnability promises.

* "none" will translate unversioned CPU models to their latest
  version.  This is planned become the default in future machine
  types (probably in pc-*-4.3).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-8-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost 53db89d93b i386: Replace -noTSX, -IBRS, -IBPB CPU models with aliases
The old CPU models will be just aliases for specific versions of
the original CPU models.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-7-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost d86a708815 i386: Define -IBRS, -noTSX, -IBRS versions of CPU models
Add versions of CPU models that are equivalent to their -IBRS,
-noTSX and -IBRS variants.

The separate variants will eventually be removed and become
aliases for these CPU versions.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-6-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost dcafd1ef0a i386: Register versioned CPU models
Add support for registration of multiple versions of CPU models.

The existing CPU models will be registered with a "-v1" suffix.

The -noTSX, -IBRS, and -IBPB CPU model variants will become
versions of the original models in a separate patch, so
make sure we register no versions for them.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-5-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost 164e779ce1 i386: Get model-id from CPU object on "-cpu help"
When introducing versioned CPU models, the string at
X86CPUDefinition::model_id might not be the model-id we'll really
use.  Instantiate a CPU object and check the model-id property on
"-cpu help"

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-4-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost dac1deae65 i386: Add x-force-features option for testing
Add a new option that can be used to disable feature flag
filtering.  This will allow CPU model compatibility test cases to
work without host hardware dependencies.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-3-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Paul Lai 0b18874bd2 i386: Introduce SnowRidge CPU model
SnowRidge CPU supports Accelerator Infrastrcture Architecture (MOVDIRI,
MOVDIR64B), CLDEMOTE and SPLIT_LOCK_DISABLE.

MOVDIRI, MOVDIR64B, and CLDEMOTE are found via CPUID.
The availability of SPLIT_LOCK_DISABLE is check via msr access

References can be found in either:
 https://software.intel.com/en-us/articles/intel-sdm
 https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-and-future-features-programming-reference

Signed-off-by: Paul Lai <paul.c.lai@intel.com>
Tested-by: Tao3 Xu <tao3.xu@intel.com>
Message-Id: <20190626162129.25345-1-paul.c.lai@intel.com>
[ehabkost: squashed SPLIT_LOCK_DETECT patch]
Message-Id: <20190626163232.25711-1-paul.c.lai@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Like Xu a94e142899 target/i386: Add CPUID.1F generation support for multi-dies PCMachine
The CPUID.1F as Intel V2 Extended Topology Enumeration Leaf would be
exposed if guests want to emulate multiple software-visible die within
each package. Per Intel's SDM, the 0x1f is a superset of 0xb, thus they
can be generated by almost same code as 0xb except die_offset setting.

If the number of dies per package is greater than 1, the cpuid_min_level
would be adjusted to 0x1f regardless of whether the host supports CPUID.1F.
Likewise, the CPUID.1F wouldn't be exposed if env->nr_dies < 2.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190620054525.37188-2-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost 1c809535e3 i386: Remove unused host_cpudef variable
The variable is completely unused, probably a leftover from
previous code clean up.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190625050008.12789-3-ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Wei Yang f69ecddb4a x86/cpu: use FeatureWordArray to define filtered_features
Use the same definition as features/user_features in CPUX86State.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20190620023746.9869-1-richardw.yang@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Roman Kagan 915aee93e7 i386: make 'hv-spinlocks' a regular uint32 property
X86CPU.hv-spinlocks is a uint32 property that has a special setter
validating the value to be no less than 0xFFF and no bigger than
UINT_MAX.  The latter check is redundant; as for the former, there
appears to be no reason to prohibit the user from setting it to a lower
value.

So nuke the dedicated getter/setter pair and convert 'hv-spinlocks' to a
regular uint32 property.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20190618110659.14744-1-rkagan@virtuozzo.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Eduardo Habkost 4f2beda453 i386: Fix signedness of hyperv_spinlock_attempts
The current default value for hv-spinlocks is 0xFFFFFFFF (meaning
"never retry").  However, the value is stored as a signed
integer, making the getter of the hv-spinlocks QOM property
return -1 instead of 0xFFFFFFFF.

Fix this by changing the type of X86CPU::hyperv_spinlock_attempts
to uint32_t.  This has no visible effect to guest operating
systems, affecting just the behavior of the QOM getter.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190615200505.31348-1-ehabkost@redhat.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Eduardo Habkost fea306520e i386: Don't print warning if phys-bits was set automatically
If cpu->host_phys_bits_limit is set, QEMU will make
cpu->phys_bits be lower than host_phys_bits on some cases.  This
triggers a warning that was supposed to be printed only if
phys-bits was explicitly set in the command-line.

Reorder the code so the value of cpu->phys_bits is validated
before the cpu->host_phys_bits handling.  This will avoid
unexpected warnings when cpu->host_phys_bits_limit is set.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190611205420.20286-1-ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu d65af288a8 i386: Update new x86_apicid parsing rules with die_offset support
In new sockets/dies/cores/threads model, the apicid of logical cpu could
imply die level info of guest cpu topology thus x86_apicid_from_cpu_idx()
need to be refactored with #dies value, so does apicid_*_offset().

To keep semantic compatibility, the legacy pkg_offset which helps to
generate CPUIDs such as 0x3 for L3 cache should be mapping to die_offset.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-5-like.xu@linux.intel.com>
[ehabkost: squash unit test patch]
Message-Id: <20190612084104.34984-6-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu 176d2cda0d i386/cpu: Consolidate die-id validity in smp context
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().

Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-4-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu c26ae61081 i386: Add die-level cpu topology to x86CPU on PCMachine
The die-level as the first PC-specific cpu topology is added to the leagcy
cpu topology model, which has one die per package implicitly and only the
numbers of sockets/cores/threads are configurable.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

     #cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies would be
initialized to one in x86_cpu_initfn() and pc_machine_initfn().

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-2-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00