Add kvm_get_supported_feature_msrs() to get supported MSR feature index list.
Add kvm_arch_get_supported_msr_feature() to get each MSR features value.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1539578845-37944-2-git-send-email-robert.hu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Remove a debugging hack which could be used to cause the
undocumented 'icebp' instruction to enable QEMU internal
debug logging. This code has always been #ifdeffed out
since it was introduced in commit aba9d61e34 in 2005;
judging by the rest of that commit (which is entirely
unrelated) it may have even been committed by accident.
(Note that WANT_ICEBP is not defined by default anyway.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20181009183314.13416-1-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Intel SDM says for CPUID function 0DH, sub-function 0:
| • ECX enumerates the size (in bytes) required by the XSAVE instruction for an
| XSAVE area containing all the user state components supported by this
| processor.
| • EBX enumerates the size (in bytes) required by the XSAVE instruction for an
| XSAVE area containing all the user state components corresponding to bits
| currently set in XCR0.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Message-Id: <20180928104319.3296-1-bigeasy@linutronix.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJbzcCHAAoJEDhwtADrkYZT3YsP/2qE4HNY/htj3IP6vNJuSaqw
CLPRTz7zWmUBTE6FqSkvLsq3X2BMFFLeaIPA9EFcbyn2km6qPqBYgg9ElXXvPZBm
6hDeRIoC8FdRD0Apozd5MGC94/lE47PheDRV8V+4KrGLaaMXEPxMZ0wP4AfdS5pS
6Pt2xuF7nPu1+OWVxMk0fXadGjGLEuOQQmTh3B21J5RaynQ3gtd6h7XFC/LJyOGG
LC/6GyPc0h7KU83VnvrRjH/EOpu1wENgrsvWsS0sem8op35Z+i9jU5BfCp4qFkDy
gCHHUEyEeyexS+W+Tj87eBtK2gfrqQx9ovo8CIsWcUwpKbdD6AMK4FKGsDNMNHab
Kg5u/M+O8nHCB7DuursF+3mqEbZHb05cfKe6JEtiq49EuORMV5hp4Ap966noSwTw
UEU0NJNA1p8EdmXVudyyyYR7wpoSSmZpoenA+bJ3nthK8K0KcU4RUGk6ZEbxfJy+
7ENl+3R2IxmxzgXv/x0tz0uFisaVW1rltTXtMte+ElQsO0qy74iHdfR7JHsmLxj9
CO/ABMVoYsWq2OJv8pWLrdKpT4v3HQLJdHhknyu0ZcJGDyICqX29ULLEhPrNEZvW
rxVxAkiemlaqxlUjbrM46CDQQm+w03OCnk7aCYcV4oK+u5+o3mCag705gMPErapZ
6uOE3fAjiWw43sA31mek
=kPZX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2018-10-22' into staging
Error reporting patches for 2018-10-22
# gpg: Signature made Mon 22 Oct 2018 13:20:23 BST
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-error-2018-10-22: (40 commits)
error: Drop bogus "use error_setg() instead" admonitions
vpc: Fail open on bad header checksum
block: Clean up bdrv_img_create()'s error reporting
vl: Simplify call of parse_name()
vl: Fix exit status for -drive format=help
blockdev: Convert drive_new() to Error
vl: Assert drive_new() does not fail in default_drive()
fsdev: Clean up error reporting in qemu_fsdev_add()
spice: Clean up error reporting in add_channel()
tpm: Clean up error reporting in tpm_init_tpmdev()
numa: Clean up error reporting in parse_numa()
vnc: Clean up error reporting in vnc_init_func()
ui: Convert vnc_display_init(), init_keyboard_layout() to Error
ui/keymaps: Fix handling of erroneous include files
vl: Clean up error reporting in device_init_func()
vl: Clean up error reporting in parse_fw_cfg()
vl: Clean up error reporting in mon_init_func()
vl: Clean up error reporting in machine_set_property()
vl: Clean up error reporting in chardev_init_func()
qom: Clean up error reporting in user_creatable_add_opts_foreach()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Calling error_report() in a function that takes an Error ** argument
is suspicious. Convert a few that are actually warnings to
warn_report().
While there, split a warning consisting of multiple sentences to
conform to conventions spelled out in warn_report()'s contract.
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Fam Zheng <famz@redhat.com>
Cc: Wei Huang <wei@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20181017082702.5581-5-armbru@redhat.com>
When migrate_add_blocker failed, the invtsc_mig_blocker is not
appended so no need to remove. This can save several instructions.
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20181006091816.7659-1-liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add handling of POST_MESSAGE hypercall. For that, add an interface to
regsiter a handler for the messages arrived from the guest on a
particular connection id (IOW set up a message connection in Hyper-V
speak).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-10-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add handling of SIGNAL_EVENT hypercall. For that, provide an interface
to associate an EventNotifier with an event connection number, so that
it's signaled when the SIGNAL_EVENT hypercall with the matching
connection ID is called by the guest.
Support for using KVM functionality for this will be added in a followup
patch.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-8-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Per Hyper-V spec, SynIC message and event flag pages are to be
implemented as so called overlay pages. That is, they are owned by the
hypervisor and, when mapped into the guest physical address space,
overlay the guest physical pages such that
1) the overlaid guest page becomes invisible to the guest CPUs until the
overlay page is turned off
2) the contents of the overlay page is preserved when it's turned off
and back on, even at a different address; it's only zeroed at vcpu
reset
This particular nature of SynIC message and event flag pages is ignored
in the current code, and guest physical pages are used directly instead.
This happens to (mostly) work because the actual guests seem not to
depend on the features listed above.
This patch implements those pages as the spec mandates.
Since the extra RAM regions, which introduce migration incompatibility,
are only added at SynIC object creation which only happens when
hyperv_synic_kvm_only == false, no extra compat logic is necessary.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-5-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Certain configurations do not allow SynIC to be used in QEMU. In
particular,
- when hyperv_vpindex is off, SINT routes can't be used as they refer to
the destination vCPU by vp_index
- older KVM (which doesn't expose KVM_CAP_HYPERV_SYNIC2) zeroes out
SynIC message and event pages on every msr load, breaking migration
OTOH in-KVM users of SynIC -- SynIC timers -- do work in those
configurations, and we shouldn't stop the guest from using them.
To cover both scenarios, introduce an X86CPU property that makes CPU
init code to skip creation of the SynIC object (and thus disables any
SynIC use in QEMU) but keeps the KVM part of the SynIC working.
The property is clear by default but is set via compat logic for older
machine types.
As a result, when hv_synic and a modern machine type are specified, QEMU
will refuse to run unless vp_index is on and the kernel is recent
enough. OTOH with an older machine type QEMU will run fine with
hv_synic=on against an older kernel and/or without vp_index enabled but
will disallow the in-QEMU uses of SynIC (in e.g. VMBus).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make Hyper-V SynIC a device which is attached as a child to a CPU. For
now it only makes SynIC visibile in the qom hierarchy, and maintains its
internal fields in sync with the respecitve msrs of the parent cpu (the
fields will be used in followup patches).
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Put a bit more consistency into handling KVM_CAP_HYPERV_SYNIC capability,
by checking its availability and determining the feasibility of hv-synic
property first, and enabling it later.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082217.29481-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will allow to build slightly leaner QEMU that supports some HyperV
features of KVM (e.g. SynIC timers, PV spinlocks, APIC assists, etc.)
but nothing else on the QEMU side.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-6-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A significant part of hyperv.c is not actually tied to x86, and can
be moved to hw/.
This will allow to maintain most of Hyper-V and VMBus
target-independent, and to avoid conflicts with inclusion of
arch-specific headers down the road in VMBus implementation.
Also this stuff can now be opt-out with CONFIG_HYPERV.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Also make the inverse function, hyperv_find_vcpu, static as it's not
used outside hyperv.c
This paves the way to making hyperv.c built optionally.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some parts of the Hyper-V hypervisor-guest interface appear to be
target-independent, so move them into a proper header.
Not that Hyper-V ARM64 emulation is around the corner but it seems more
conveninent to have most of Hyper-V and VMBus target-independent, and
allows to avoid conflicts with inclusion of arch-specific headers down
the road in VMBus implementation.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921082041.29380-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's nothing kvm-specific in it so follow the suite and replace
"kvm_hv" prefix with "hyperv".
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-9-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Multiple entities (e.g. VMBus devices) can use the same SINT route. To
make their lives easier in maintaining SINT route ownership, make it
reference-counted. Adjust the respective API names accordingly.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-8-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use X86CPU pointer to refer to the respective HvSintRoute instead of
vp_index. This is more convenient and also paves the way for future
enhancements.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-7-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sint ack callback accept an opaque pointer, that is stored on
sint_route at creation time.
This allows for more convenient interaction with the callback.
Besides, nothing outside hyperv.c should need to know the layout of
HvSintRoute fields any more so its declaration can be removed from the
header.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-6-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's no point setting up an sint ack notifier if no callback is
specified.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-5-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
accel_init_machine sets *(acc->allowed) to true if acc->init_machine(ms)
succeeds. There's no need to have both hvf_allowed and hvf_disabled.
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018143051.48508-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
According to Intel(R)64 and IA-32 Architectures Software Developer's
Manual, the following one-byte registers should be fetched when REX
prefix is present (sorted by reg encoding index):
AL, CL, DL, BL, SPL, BPL, SIL, DIL, R8L - R15L
The first 8 are fetched if REX.R is zero, the last 8 if non-zero.
The following registers should be fetched for instructions without REX
prefix (also sorted by reg encoding index):
AL, CL, DL, BL, AH, CH, DH, BH
Current emulation code doesn't handle accesses to SPL, BPL, SIL, DIL
when REX is present, thefore an instruction 40883e "mov %dil,(%rsi)" is
decoded as "mov %bh,(%rsi)".
That caused an infinite loop in vp_reset:
https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03293.html
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018134401.44471-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V PV IPI support is merged to KVM, enable the feature in Qemu. When
enabled, this allows Windows guests to send IPIs to other vCPUs with a
single hypercall even when there are >64 vCPUs in the request.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20181009130853.6412-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The exception.pad field is going to be renamed to pending in an upcoming
header file update. Remove the unnecessary initialization; it was
introduced to please valgrind (commit 7e680753cf) but they were later
rendered unnecessary by commit 076796f8fd, which added the "= {}"
initializer to the declaration of "events". Therefore the patch does
not change behavior in any way.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This patch fixes the checking of boundary crossing instructions.
In icount mode only first instruction of the block may cross
the page boundary to keep the translation deterministic.
These conditions already existed, but compared the wrong variable.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180920071702.22477.43980.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While at it, also rename var to indicate it is not used only in KVM.
Reviewed-by: Nikita Leshchenko <nikita.leshchenko@oracle.com>
Reviewed-by: Patrick Colp <patrick.colp@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20180914003827.124570-2-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This flag will be used for KVM's nested VMX migration; the HF_GUEST_MASK name
is already used in KVM, adopt it in QEMU as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Interrupt handling depends on various flags in env->hflags or env->hflags2,
and the exact detail were not exactly replicated between x86_cpu_has_work
and x86_cpu_exec_interrupt. Create a new function that extracts the
highest-priority non-masked interrupt, and use it in both functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
And convert it to a bool to use an existing hole
in the struct.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AMD IOMMU does not (yet) support interrupt remapping. But
kvm_arch_fixup_msi_route assumes that all implementations do and crashes
when the AMD IOMMU is used in KVM mode.
Fixes: 8b5ed7dffa ("intel_iommu: add support for split irqchip")
Reported-by: Christopher Goldsworthy <christopher.goldsworthy@outlook.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <48ae78d8-58ec-8813-8680-6f407ea46041@siemens.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The capability macros are always defined, since they come from kernel
headers that are copied into the QEMU tree. Remove the unnecessary #ifdefs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current implementation has three bugs,
* segment limits are not enforced in protected mode if the L bit is set
in the target segment descriptor
* segment limits are not enforced in compatibility mode (ljmp to 32-bit
code segment in long mode)
* #GP(new_cs) is generated rather than #GP(0)
Now the segment limits are enforced if we're not in long mode OR the
target code segment doesn't have the L bit set.
Signed-off-by: Andrew Oates <aoates@google.com>
Message-Id: <20180816011903.39816-1-andrew@andrewoates.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently call gates are always treated as 32-bit gates. In IA-32e mode
(either compatibility or 64-bit submode), system segment descriptors are
always 64-bit. Treating them as 32-bit has the expected unfortunate
effect: only the lower 32 bits of the offset are loaded, the stack
pointer is truncated, a bad new stack pointer is loaded from the TSS (if
switching privilege levels), etc.
This change adds support for 64-bit call gate to the lcall and ljmp
instructions. Additionally, there should be a check for non-canonical
stack pointers, but I've omitted that since there doesn't seem to be
checks for non-canonical addresses in this code elsewhere.
I've left the raise_exception_err_ra lines unwapped at 80 columns to
match the style in the rest of the file.
Signed-off-by: Andrew Oates <aoates@google.com>
Message-Id: <20180819181725.34098-1-andrew@andrewoates.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported by Coverity:
Error: RESOURCE_LEAK (CWE-772): [#def439]
qemu-2.12.0/target/i386/cpu.c:3179: alloc_fn: Storage is returned from allocation function "qdict_new".
qemu-2.12.0/qobject/qdict.c:34:5: alloc_fn: Storage is returned from allocation function "g_malloc0".
qemu-2.12.0/qobject/qdict.c:34:5: var_assign: Assigning: "qdict" = "g_malloc0(4120UL)".
qemu-2.12.0/qobject/qdict.c:37:5: return_alloc: Returning allocated memory "qdict".
qemu-2.12.0/target/i386/cpu.c:3179: var_assign: Assigning: "props" = storage returned from "qdict_new()".
qemu-2.12.0/target/i386/cpu.c:3217: leaked_storage: Variable "props" going out of scope leaks the storage it points to.
This was introduced by commit b8097deb35 ("i386: Improve
query-cpu-model-expansion full mode").
The leak is only theoretical: if ret->model->props is set to
props, the qapi_free_CpuModelExpansionInfo() call will free props
too in case of errors. The only way for this to not happen is if
we enter the default branch of the switch statement, which would
never happen because all CpuModelExpansionType values are being
handled.
It's still worth to change this to make the allocation logic
easier to follow and make the Coverity error go away. To make
everything simpler, initialize ret->model and ret->model->props
earlier in the function.
While at it, remove redundant check for !prop because prop is
always initialized at the beginning of the function.
Fixes: b8097deb35
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180816183509.8231-1-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Many of these are marked as "intentional/fix required" because they
just need adding a fall through comment. This is exactly what this
patch does, except for target/mips/translate.c where it is easier to
duplicate the code, and hw/audio/sb16.c where I consulted the DOSBox
sources and decide to just remove the LOG_UNIMP before the fallthrough.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Enabling TOPOEXT is always allowed, but it can't be enabled
blindly by "-cpu host" because it may make guests crash if the
rest of the cache topology information isn't provided or isn't
consistent.
This addresses the bug reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1613277
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180809221852.15285-1-ehabkost@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
New CPU models mostly inherit features from ancestor Skylake, while addin new
features: UMIP, New Instructions ( PCONIFIG (server only), WBNOINVD,
AVX512_VBMI2, GFNI, AVX512_VNNI, VPCLMULQDQ, VAES, AVX512_BITALG),
Intel PT and 5-level paging (Server only). As well as
IA32_PRED_CMD, SSBD support for speculative execution
side channel mitigations.
Note:
For 5-level paging, Guest physical address width can be configured, with
parameter "phys-bits". Unless explicitly specified, we still use its default
value, even for Icelake-Server cpu model.
At present, hold on expose IA32_ARCH_CAPABILITIES to guest, as 1) This MSR
actually presents more than 1 'feature', maintainers are considering expanding current
features presentation of only CPUIDs to MSR bits; 2) a reasonable default value
for MSR_IA32_ARCH_CAPABILITIES needs to settled first. These 2 are actully
beyond Icelake CPU model itself but fundamental. So split these work apart
and do it later.
https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00774.htmlhttps://lists.gnu.org/archive/html/qemu-devel/2018-07/msg00796.html
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-6-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
WBNOINVD: Write back and do not invalidate cache, enumerated by
CPUID.(EAX=80000008H, ECX=0):EBX[bit 9].
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-5-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Support of IA32_PRED_CMD MSR already be enumerated by same CPUID bit as
SPEC_CTRL.
At present, mark CPUID_7_0_EDX_ARCH_CAPABILITIES unmigratable, per Paolo's
comment.
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-3-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
IA32_PRED_CMD MSR gives software a way to issue commands that affect the state
of indirect branch predictors. Enumerated by CPUID.(EAX=7H,ECX=0):EDX[26].
IA32_ARCH_CAPABILITIES MSR enumerates architectural features of RDCL_NO and
IBRS_ALL. Enumerated by CPUID.(EAX=07H, ECX=0):EDX[29].
https://software.intel.com/sites/default/files/managed/c5/63/336996-Speculative-Execution-Side-Channel-Mitigations.pdf
Signed-off-by: Robert Hoo <robert.hu@linux.intel.com>
Message-Id: <1530781798-183214-2-git-send-email-robert.hu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
MSR_SMI_COUNT started being migrated in QEMU 2.12. Do not migrate it
on older machine types, or the subsection causes a load failure for
guests that use SMM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rename DCACHE to DATA_CACHE and ICACHE to INSTRUCTION_CACHE.
This avoids conflict with Linux asm/cachectl.h macros and fixes
build failure on mips hosts.
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180717194010.30096-1-ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Aleksandar Markovic <amarkovic@wavecomp.com>
Reviewed-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Hyper-V identifies vCPUs by Virtual Processor (VP) index which can be
queried by the guest via HV_X64_MSR_VP_INDEX msr. It is defined by the
spec as a sequential number which can't exceed the maximum number of
vCPUs per VM.
It has to be owned by QEMU in order to preserve it across migration.
However, the initial implementation in KVM didn't allow to set this
msr, and KVM used its own notion of VP index. Fortunately, the way
vCPUs are created in QEMU/KVM makes it likely that the KVM value is
equal to QEMU cpu_index.
So choose cpu_index as the value for vp_index, and push that to KVM on
kernels that support setting the msr. On older ones that don't, query
the kernel value and assert that it's in sync with QEMU.
Besides, since handling errors from vCPU init at hotplug time is
impossible, disable vCPU hotplug.
This patch also introduces accessor functions to encapsulate the mapping
between a vCPU and its vp_index.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180702134156.13404-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In Hyper-V-related code, vCPUs are identified by their VP (virtual
processor) index. Since it's customary for "vcpu_id" in QEMU to mean
APIC id, rename the respective variables to "vp_index" to make the
distinction clear.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180702134156.13404-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds field with content of KERNEL_GS_BASE MSR to QEMU note in
ELF dump.
On Windows, if all vCPUs are running usermode tasks at the time the dump is
created, this can be helpful in the discovery of guest system structures
during conversion ELF dump to MEMORY.DMP dump.
Signed-off-by: Viktor Prutyanov <viktor.prutyanov@virtuozzo.com>
Message-Id: <20180714123000.11326-1-viktor.prutyanov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit d6dcc5583e, '-cpu ?' shows the description of the
X86_CPU_TYPE_NAME("max") for the host CPU model:
Enables all features supported by the accelerator in the current host
instead of the expected:
KVM processor with all supported host features
or
HVF processor with all supported host features
This is caused by the early use of kvm_enabled() and hvf_enabled() in
a class_init function. Since the accelerator isn't configured yet, both
helpers return false unconditionally.
A QEMU binary will only be compiled with one of these accelerators, not
both. The appropriate description can thus be decided at build time.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <153055056654.212317.4697363278304826913.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Correct the output of the "info mem" and "info tlb" monitor commands to
correctly show canonical addresses.
In 48-bit addressing mode, the upper 16 bits of linear addresses are
equal to bit 47. In 57-bit addressing mode (LA57), the upper 7 bits of
linear addresses are equal to bit 56.
Signed-off-by: Doug Gale <doug16k@gmail.com>
Message-Id: <20180617084025.29198-1-doug16k@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This implements NPT suport for SVM by hooking into
x86_cpu_handle_mmu_fault where it reads the stage-1 page table. Whether
we need to perform this 2nd stage translation, and how, is decided
during vmrun and stored in hflags2, along with nested_cr3 and
nested_pg_mode.
As get_hphys performs a direct cpu_vmexit in case of NPT faults, we need
retaddr in that function. To avoid changing the signature of
cpu_handle_mmu_fault, this passes the value from tlb_fill to get_hphys
via the CPU state.
This was tested successfully via the Jailhouse hypervisor.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <567473a0-6005-5843-4c73-951f476085ca@web.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add support for Hyper-V TLB flush which recently got added to KVM.
Just like regular Hyper-V we announce HV_EX_PROCESSOR_MASKS_RECOMMENDED
regardless of how many vCPUs we have. Windows is 'smart' and uses less
expensive non-EX Hypercall whenever possible (when it wants to flush TLB
for all vCPUs or the maximum vCPU index in the vCPU set requires flushing
is less than 64).
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20180610184927.19309-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When guest CPU PM is enabled, and with -cpu host, expose the host CPU
MWAIT leaf in the CPUID so guest can make good PM decisions.
Note: the result is 100% CPU utilization reported by host as host
no longer knows that the CPU is halted.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180622192148.178309-3-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With this flag, kvm allows guest to control host CPU power state. This
increases latency for other processes using same host CPU in an
unpredictable way, but if decreases idle entry/exit times for the
running VCPU, so to use it QEMU needs a hint about whether host CPU is
overcommitted, hence the flag name.
Follow-up patches will expose this capability to guest
(using mwait leaf).
Based on a patch by Wanpeng Li <kernellwp@gmail.com> .
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20180622192148.178309-2-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's start to use "info pic" just like other platforms. For now we
keep the command for a while so that old users can know what is the new
command to use.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171229073104.3810-6-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It calls cpu_loop_exit in system emulation mode (and should never be
called in user emulation mode).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <6f4d44ffde55d074cbceb48309c1678600abad2f.1522769774.git.jan.kiszka@web.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We need to terminate the translation block after STGI so that pending
interrupts can be injected.
This fixes pending NMI injection for Jailhouse which uses "stgi; clgi"
to open a brief injection window.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <37939b244dda0e9cccf96ce50f2b15df1e48315d.1522769774.git.jan.kiszka@web.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Check for SVM interception prior to injecting an NMI. Tested via the
Jailhouse hypervisor.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <c65877e9a011ee4962931287e59f502c482b8d0b.1522769774.git.jan.kiszka@web.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some variations of Linux kernels end up accessing MSR's that the Windows
Hypervisor doesn't implement which causes a GP to be returned to the guest.
This fix registers QEMU for unimplemented MSR access and globally returns 0 on
reads and ignores writes. This behavior is allows the Linux kernel to probe the
MSR with a write/read/check sequence it does often without failing the access.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <20180605221500.21674-2-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds a workaround to an incorrect value setting
CPUID Fn8000_0001_ECX[bit 9 OSVW] = 1. This can cause a guest linux kernel
to panic when an issue to rdmsr C001_0140h returns 0. Disabling this feature
correctly allows the guest to boot without accessing the osv workarounds.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <20180605221500.21674-1-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The implementation of these two instructions was swapped.
At the same time, unify the setup of eflags for the insn group.
Reported-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20170712192902.15493-1-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Determining the size of a field is useful when you don't have a struct
variable handy. Open-coding this is ugly.
This patch adds the sizeof_field() macro, which is similar to
typeof_field(). Existing instances are updated to use the macro.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20180614164431.29305-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Remove generic non-intel check while validating hyperthreading support.
Certain AMD CPUs can support hyperthreading now.
CPU family with TOPOEXT feature can support hyperthreading now.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Tested-by: Geoffrey McRae <geoff@hostfission.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1529443919-67509-4-git-send-email-babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Enable TOPOEXT feature on EPYC CPU. This is required to support
hyperthreading on VM guests. Also extend xlevel to 0x8000001E.
Disable topoext on PC_COMPAT_2_12 and keep xlevel 0x8000000a.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1529443919-67509-3-git-send-email-babu.moger@amd.com>
[ehabkost: Added EPYC-IBPB.xlevel to PC_COMPAT_2_12]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This is part of topoext support. To keep the compatibility, it is better
we support all the combination of nr_cores and nr_threads currently
supported. By allowing more nr_cores and nr_threads, we might end up with
more nodes than we can actually support with the real hardware. We need to
fix up the node id to make this work. We can achieve this by shifting the
socket_id bits left to address more nodes.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1529443919-67509-2-git-send-email-babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Enabling TOPOEXT feature might cause compatibility issues if
older kernels does not set this feature. Lets set this feature
unconditionally.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1528939107-17193-2-git-send-email-babu.moger@amd.com>
[ehabkost: rewrite comment and commit message]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
AMD future CPUs expose a mechanism to tell the guest that the
Speculative Store Bypass Disable is not needed and that the
CPU is all good.
This is exposed via the CPUID 8000_0008.EBX[26] bit.
See 124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199889
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Message-Id: <20180601153809.15259-3-konrad.wilk@oracle.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
AMD future CPUs expose _two_ ways to utilize the Intel equivalant
of the Speculative Store Bypass Disable. The first is via
the virtualized VIRT_SPEC CTRL MSR (0xC001_011f) and the second
is via the SPEC_CTRL MSR (0x48). The document titled:
124441_AMD64_SpeculativeStoreBypassDisable_Whitepaper_final.pdf
gives priority of SPEC CTRL MSR over the VIRT SPEC CTRL MSR.
A copy of this document is available at
https://bugzilla.kernel.org/show_bug.cgi?id=199889
Anyhow, this means that on future AMD CPUs there will be _two_ ways to
deal with SSBD.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Message-Id: <20180601153809.15259-2-konrad.wilk@oracle.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
OSPKE is not a static feature flag: it changes dynamically at
runtime depending on CR4, and it was never configurable: KVM
never returned OSPKE on GET_SUPPORTED_CPUID, and on TCG enables
it automatically if CR4_PKE_MASK is set.
Remove OSPKE from the feature name array so users don't try to
configure it manually.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180611203712.12086-1-ehabkost@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
OSXAVE is not a static feature flag: it changes dynamically at
runtime depending on CR4, and it was never configurable: KVM
never returned OSXSAVE on GET_SUPPORTED_CPUID, and it is not
included in TCG_EXT_FEATURES.
Remove OSXSAVE from the feature name array so users don't try to
configure it manually.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180611203855.13269-1-ehabkost@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When using '-cpu help' the list of CPUID features is grouped according
to the internal low level CPUID grouping. The data printed results in
very long lines too.
This combines to make it hard for users to read the output and identify
if QEMU knows about the feature they wish to use.
This change gets rid of the grouping of features and treats all flags as
single list. The list is sorted into alphabetical order and the printing
with line wrapping at the 77th column.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180606165527.17365-4-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The current list of CPU model names output by "-cpu help" is sorted
alphabetically based on the internal QOM class name. The text that is
displayed, however, uses the CPU model name, which is equivalent to the
QOM class name, minus a suffix. Unfortunately that suffix has an effect
on the sort ordering, for example, causing the various Broadwell
variants to appear reversed:
x86 486
x86 Broadwell-IBRS Intel Core Processor (Broadwell, IBRS)
x86 Broadwell-noTSX-IBRS Intel Core Processor (Broadwell, no TSX, IBRS
x86 Broadwell-noTSX Intel Core Processor (Broadwell, no TSX)
x86 Broadwell Intel Core Processor (Broadwell)
x86 Conroe Intel Celeron_4x0 (Conroe/Merom Class Core 2)
By sorting on the actual CPU model name text that is displayed, the
result is
x86 486
x86 Broadwell Intel Core Processor (Broadwell)
x86 Broadwell-IBRS Intel Core Processor (Broadwell, IBRS)
x86 Broadwell-noTSX Intel Core Processor (Broadwell, no TSX)
x86 Broadwell-noTSX-IBRS Intel Core Processor (Broadwell, no TSX, IBRS)
x86 Conroe Intel Celeron_4x0 (Conroe/Merom Class Core 2)
This requires extra string allocations during sorting, but this is not a
concern given the usage scenario and the number of CPU models that exist.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180606165527.17365-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Since the addition of the -IBRS CPU model variants, the descriptions
shown by '-cpu help' are not well aligned, as several model names
overflow the space allowed. Right aligning the CPU model names is also
not attractive, because it obscures the common name prefixes of many
models. The CPU model name field needs to be 4 characters larger, and
be left aligned instead.
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180606165527.17365-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add support for cpuid leaf CPUID_8000_001E. Build the config that closely
match the underlying hardware. Please refer to the Processor Programming
Reference (PPR) for AMD Family 17h Model for more details.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1528498581-131037-2-git-send-email-babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add information for cpuid 0x8000001D leaf. Populate cache topology information
for different cache types (Data Cache, Instruction Cache, L2 and L3) supported
by 0x8000001D leaf. Please refer to the Processor Programming Reference (PPR)
for AMD Family 17h Model for more details.
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1527176614-26271-3-git-send-email-babu.moger@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Always initialize CPUCaches structs with cache information, even
if legacy_cache=true. Use different CPUCaches struct for
CPUID[2], CPUID[4], and the AMD CPUID leaves.
This will simplify a lot the logic inside cpu_x86_cpuid().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <1527176614-26271-2-git-send-email-babu.moger@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Do the cast to uintptr_t within the helper, so that the compiler
can type check the pointer argument. We can also do some more
sanity checking of the index argument.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now we've updated our copy of the kernel headers we can remove the
compatibility shim that handled KVM_HINTS_REALTIME not being defined.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180525132755.21839-7-peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>