Commit Graph

337 Commits

Author SHA1 Message Date
Cathy Zhang 5dd13f2a5b target/i386: Add SERIALIZE cpu feature
The availability of the SERIALIZATION instruction is indicated
by the presence of the CPUID feature flag SERIALIZE, which is
defined as CPUID.(EAX=7,ECX=0):ECX[bit 14].

The release spec link is as follows:
https://software.intel.com/content/dam/develop/public/us/en/documents/\
architecture-instruction-set-extensions-programming-reference.pdf

The associated kvm patch link is as follows:
https://lore.kernel.org/patchwork/patch/1268025/

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1593991036-12183-2-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 19:26:54 -04:00
Luwei Kang cbe0dad190 target/i386: Correct the warning message of Intel PT
The CPUID level need to be set to 0x14 manually on old
machine-type if Intel PT is enabled in guest. E.g. the
CPUID[0].EAX(level)=7 and CPUID[7].EBX[25](intel-pt)=1 when the
Qemu with "-machine pc-i440fx-3.1 -cpu qemu64,+intel-pt" parameter.

This patch corrects the warning message of the previous
submission(ddc2fc9).

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1593499113-4768-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 18:02:24 -04:00
Paolo Bonzini e1e43813e7 KVM: x86: believe what KVM says about WAITPKG
Currently, QEMU is overriding KVM_GET_SUPPORTED_CPUID's answer for
the WAITPKG bit depending on the "-overcommit cpu-pm" setting.  This is a
bad idea because it does not even check if the host supports it, but it
can be done in x86_cpu_realizefn just like we do for the MONITOR bit.

This patch moves it there, while making it conditional on host
support for the related UMWAIT MSR.

Cc: qemu-stable@nongnu.org
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 18:02:22 -04:00
Roman Bolshakov 5009ef22c6 i386: hvf: Don't duplicate register reset
hvf_reset_vcpu() duplicates actions performed by x86_cpu_reset(). The
difference is that hvf_reset_vcpu() stores initial values directly to
VMCS while x86_cpu_reset() stores it in CPUX86State and then
cpu_synchronize_all_post_init() or cpu_synchronize_all_post_reset()
flushes CPUX86State into VMCS. That makes hvf_reset_vcpu() a kind of
no-op.

Here's the trace of CPU state modifications during VM start:
  hvf_reset_vcpu (resets VMCS)
  cpu_synchronize_all_post_init (overwrites VMCS fields written by
                                 hvf_reset_vcpu())
  cpu_synchronize_all_states
  hvf_reset_vcpu (resets VMCS)
  cpu_synchronize_all_post_reset (overwrites VMCS fields written by
                                  hvf_reset_vcpu())

General purpose registers, system registers, segment descriptors, flags
and IP are set by hvf_put_segments() in post-init and post-reset,
therefore it's safe to remove them from hvf_reset_vcpu().

PDPTE initialization can be dropped because Intel SDM (26.3.1.6 Checks
on Guest Page-Directory-Pointer-Table Entries) doesn't require PDPTE to
be clear unless PAE is used: "A VM entry to a guest that does not use
PAE paging does not check the validity of any PDPTEs."
And if PAE is used, PDPTE's are initialized from CR3 in macvm_set_cr0().

Cc: Cameron Esfahani <dirty@apple.com>
Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20200630102824.77604-8-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 18:02:19 -04:00
Paolo Bonzini b16c0e20c7 KVM: add support for AMD nested live migration
Support for nested guest live migration is part of Linux 5.8, add the
corresponding code to QEMU.  The migration format consists of a few
flags, is an opaque 4k blob.

The blob is in VMCB format (the control area represents the L1 VMCB
control fields, the save area represents the pre-vmentry state; KVM does
not use the host save area since the AMD manual allows that) but QEMU
does not really care about that.  However, the flags need to be
copied to hflags/hflags2 and back.

In addition, support for retrieving and setting the AMD nested virtualization
states allows the L1 guest to be reset while running a nested guest, but
a small bug in CPU reset needs to be fixed for that to work.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 18:02:17 -04:00
Markus Armbruster 992861fb1e error: Eliminate error_propagate() manually
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  The previous two commits did that for sufficiently simple
cases with Coccinelle.  Do it for several more manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-37-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster 668f62ec62 error: Eliminate error_propagate() with Coccinelle, part 1
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away.  Convert

    if (!foo(..., &err)) {
        ...
        error_propagate(errp, err);
        ...
        return ...
    }

to

    if (!foo(..., errp)) {
        ...
        ...
        return ...
    }

where nothing else needs @err.  Coccinelle script:

    @rule1 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
         if (
    (
    -        fun(args, &err, args2)
    +        fun(args, errp, args2)
    |
    -        !fun(args, &err, args2)
    +        !fun(args, errp, args2)
    |
    -        fun(args, &err, args2) op c1
    +        fun(args, errp, args2) op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    )
         }

    @rule2 forall@
    identifier fun, err, errp, lbl;
    expression list args, args2;
    expression var;
    binary operator op;
    constant c1, c2;
    symbol false;
    @@
    -    var = fun(args, &err, args2);
    +    var = fun(args, errp, args2);
         ... when != err
         if (
    (
             var
    |
             !var
    |
             var op c1
    )
            )
         {
             ... when != err
                 when != lbl:
                 when strict
    -        error_propagate(errp, err);
             ... when != err
    (
             return;
    |
             return c2;
    |
             return false;
    |
             return var;
    )
         }

    @depends on rule1 || rule2@
    identifier err;
    @@
    -    Error *err = NULL;
         ... when != err

Not exactly elegant, I'm afraid.

The "when != lbl:" is necessary to avoid transforming

         if (fun(args, &err)) {
             goto out
         }
         ...
     out:
         error_propagate(errp, err);

even though other paths to label out still need the error_propagate().
For an actual example, see sclp_realize().

Without the "when strict", Coccinelle transforms vfio_msix_setup(),
incorrectly.  I don't know what exactly "when strict" does, only that
it helps here.

The match of return is narrower than what I want, but I can't figure
out how to express "return where the operand doesn't use @err".  For
an example where it's too narrow, see vfio_intx_enable().

Silently fails to convert hw/arm/armsse.c, because Coccinelle gets
confused by ARMSSE being used both as typedef and function-like macro
there.  Converted manually.

Line breaks tidied up manually.  One nested declaration of @local_err
deleted manually.  Preexisting unwanted blank line dropped in
hw/riscv/sifive_e.c.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-35-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster 778a2dc592 qom: Use returned bool to check for failure, Coccinelle part
The previous commit enables conversion of

    foo(..., &err);
    if (err) {
        ...
    }

to

    if (!foo(..., errp)) {
        ...
    }

for QOM functions that now return true / false on success / error.
Coccinelle script:

    @@
    identifier fun = {
        object_apply_global_props, object_initialize_child_with_props,
        object_initialize_child_with_propsv, object_property_get,
        object_property_get_bool, object_property_parse, object_property_set,
        object_property_set_bool, object_property_set_int,
        object_property_set_link, object_property_set_qobject,
        object_property_set_str, object_property_set_uint, object_set_props,
        object_set_propv, user_creatable_add_dict,
        user_creatable_complete, user_creatable_del
    };
    expression list args, args2;
    typedef Error;
    Error *err;
    @@
    -    fun(args, &err, args2);
    -    if (err)
    +    if (!fun(args, &err, args2))
         {
             ...
         }

Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.

Line breaks tidied up manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-29-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Markus Armbruster 5325cc34a2 qom: Put name parameter before value / visitor parameter
The object_property_set_FOO() setters take property name and value in
an unusual order:

    void object_property_set_FOO(Object *obj, FOO_TYPE value,
                                 const char *name, Error **errp)

Having to pass value before name feels grating.  Swap them.

Same for object_property_set(), object_property_get(), and
object_property_parse().

Convert callers with this Coccinelle script:

    @@
    identifier fun = {
        object_property_get, object_property_parse, object_property_set_str,
        object_property_set_link, object_property_set_bool,
        object_property_set_int, object_property_set_uint, object_property_set,
        object_property_set_qobject
    };
    expression obj, v, name, errp;
    @@
    -    fun(obj, v, name, errp)
    +    fun(obj, name, v, errp)

Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information".  Convert that one manually.

Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.

Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused
by RXCPU being used both as typedef and function-like macro there.
Convert manually.  The other files using RXCPU that way don't need
conversion.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-27-armbru@redhat.com>
[Straightforwad conflict with commit 2336172d9b "audio: set default
value for pcspk.iobase property" resolved]
2020-07-10 15:18:08 +02:00
Markus Armbruster 62a35aaa31 qapi: Use returned bool to check for failure, Coccinelle part
The previous commit enables conversion of

    visit_foo(..., &err);
    if (err) {
        ...
    }

to

    if (!visit_foo(..., errp)) {
        ...
    }

for visitor functions that now return true / false on success / error.
Coccinelle script:

    @@
    identifier fun =~ "check_list|input_type_enum|lv_start_struct|lv_type_bool|lv_type_int64|lv_type_str|lv_type_uint64|output_type_enum|parse_type_bool|parse_type_int64|parse_type_null|parse_type_number|parse_type_size|parse_type_str|parse_type_uint64|print_type_bool|print_type_int64|print_type_null|print_type_number|print_type_size|print_type_str|print_type_uint64|qapi_clone_start_alternate|qapi_clone_start_list|qapi_clone_start_struct|qapi_clone_type_bool|qapi_clone_type_int64|qapi_clone_type_null|qapi_clone_type_number|qapi_clone_type_str|qapi_clone_type_uint64|qapi_dealloc_start_list|qapi_dealloc_start_struct|qapi_dealloc_type_anything|qapi_dealloc_type_bool|qapi_dealloc_type_int64|qapi_dealloc_type_null|qapi_dealloc_type_number|qapi_dealloc_type_str|qapi_dealloc_type_uint64|qobject_input_check_list|qobject_input_check_struct|qobject_input_start_alternate|qobject_input_start_list|qobject_input_start_struct|qobject_input_type_any|qobject_input_type_bool|qobject_input_type_bool_keyval|qobject_input_type_int64|qobject_input_type_int64_keyval|qobject_input_type_null|qobject_input_type_number|qobject_input_type_number_keyval|qobject_input_type_size_keyval|qobject_input_type_str|qobject_input_type_str_keyval|qobject_input_type_uint64|qobject_input_type_uint64_keyval|qobject_output_start_list|qobject_output_start_struct|qobject_output_type_any|qobject_output_type_bool|qobject_output_type_int64|qobject_output_type_null|qobject_output_type_number|qobject_output_type_str|qobject_output_type_uint64|start_list|visit_check_list|visit_check_struct|visit_start_alternate|visit_start_list|visit_start_struct|visit_type_.*";
    expression list args;
    typedef Error;
    Error *err;
    @@
    -    fun(args, &err);
    -    if (err)
    +    if (!fun(args, &err))
         {
             ...
         }

A few line breaks tidied up manually.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-19-armbru@redhat.com>
2020-07-10 15:18:08 +02:00
Eduardo Habkost 730319aef0 i386: Mask SVM features if nested SVM is disabled
QEMU incorrectly validates FEAT_SVM feature flags against
GET_SUPPORTED_CPUID even if SVM features are being masked out by
cpu_x86_cpuid().  This can make QEMU print warnings on most AMD
CPU models, even when SVM nesting is disabled (which is the
default).

This bug was never detected before because of a Linux KVM bug:
until Linux v5.6, KVM was not filtering out SVM features in
GET_SUPPORTED_CPUID when nested was disabled.  This KVM bug was
fixed in Linux v5.7-rc1, on Linux commit a50718cc3f43 ("KVM:
nSVM: Expose SVM features to L1 iff nested is enabled").

Fix the problem by adding a CPUID_EXT3_SVM dependency to all
FEAT_SVM feature flags in the feature_dependencies table.

Reported-by: Yanan Fu <yfu@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20200623230116.277409-1-ehabkost@redhat.com>
[Fix testcase. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:40 -04:00
Tao Xu 47f0d11d21 target/i386: Add notes for versioned CPU models
Add which features are added or removed in this version.

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200324051034.30541-1-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-26 09:39:39 -04:00
Markus Armbruster ce189ab230 qdev: Convert bus-less devices to qdev_realize() with Coccinelle
All remaining conversions to qdev_realize() are for bus-less devices.
Coccinelle script:

    // only correct for bus-less @dev!

    @@
    expression errp;
    expression dev;
    @@
    -    qdev_init_nofail(dev);
    +    qdev_realize(dev, NULL, &error_fatal);

    @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
    expression errp;
    expression dev;
    symbol true;
    @@
    -    object_property_set_bool(OBJECT(dev), true, "realized", errp);
    +    qdev_realize(DEVICE(dev), NULL, errp);

    @ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
    expression errp;
    expression dev;
    symbol true;
    @@
    -    object_property_set_bool(dev, true, "realized", errp);
    +    qdev_realize(DEVICE(dev), NULL, errp);

Note that Coccinelle chokes on ARMSSE typedef vs. macro in
hw/arm/armsse.c.  Worked around by temporarily renaming the macro for
the spatch run.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-57-armbru@redhat.com>
2020-06-15 22:06:04 +02:00
Peter Maydell 7d3660e798 * Miscellaneous fixes and feature enablement (many)
* SEV refactoring (David)
 * Hyper-V initial support (Jon)
 * i386 TCG fixes (x87 and SSE, Joseph)
 * vmport cleanup and improvements (Philippe, Liran)
 * Use-after-free with vCPU hot-unplug (Nengyuan)
 * run-coverity-scan improvements (myself)
 * Record/replay fixes (Pavel)
 * -machine kernel_irqchip=split improvements for INTx (Peter)
 * Code cleanups (Philippe)
 * Crash and security fixes (PJP)
 * HVF cleanups (Roman)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7jpdAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfjwf/X7+0euuE9dwKFKDDMmIi+4lRWnq7
 gSOyE1BYSfDIUXRIukf64konXe0VpiotNYlyEaYnnQjkMdGm5E9iXKF+LgEwXj/t
 NSGkfj5J3VeWRG4JJp642CSN/aZWO8uzkenld3myCnu6TicuN351tDJchiFwAk9f
 wsXtgLKd67zE8MLVt8AP0rNTbzMHttPXnPaOXDCuwjMHNvMEKnC93UeOeM0M4H5s
 3Dl2HvsNWZ2SzUG9mAbWp0bWWuoIb+Ep9//87HWANvb7Z8jratRws18i6tYt1sPx
 8zOnUS87sVnh1CQlXBDd9fEcqBUVgR9pAlqaaYavNhFp5eC31euvpDU8Iw==
 =F4sU
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Miscellaneous fixes and feature enablement (many)
* SEV refactoring (David)
* Hyper-V initial support (Jon)
* i386 TCG fixes (x87 and SSE, Joseph)
* vmport cleanup and improvements (Philippe, Liran)
* Use-after-free with vCPU hot-unplug (Nengyuan)
* run-coverity-scan improvements (myself)
* Record/replay fixes (Pavel)
* -machine kernel_irqchip=split improvements for INTx (Peter)
* Code cleanups (Philippe)
* Crash and security fixes (PJP)
* HVF cleanups (Roman)

# gpg: Signature made Fri 12 Jun 2020 16:57:04 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (116 commits)
  target/i386: Remove obsolete TODO file
  stubs: move Xen stubs to accel/
  replay: fix replay shutdown for console mode
  exec/cpu-common: Move MUSB specific typedefs to 'hw/usb/hcd-musb.h'
  hw/usb: Move device-specific declarations to new 'hcd-musb.h' header
  exec/memory: Remove unused MemoryRegionMmio type
  checkpatch: reversed logic with acpi test checks
  target/i386: sev: Unify SEVState and SevGuestState
  target/i386: sev: Remove redundant handle field
  target/i386: sev: Remove redundant policy field
  target/i386: sev: Remove redundant cbitpos and reduced_phys_bits fields
  target/i386: sev: Partial cleanup to sev_state global
  target/i386: sev: Embed SEVState in SevGuestState
  target/i386: sev: Rename QSevGuestInfo
  target/i386: sev: Move local structure definitions into .c file
  target/i386: sev: Remove unused QSevGuestInfoClass
  xen: fix build without pci passthrough
  i386: hvf: Drop HVFX86EmulatorState
  i386: hvf: Move mmio_buf into CPUX86State
  i386: hvf: Move lazy_flags into CPUX86State
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/i386/acpi-build.c
2020-06-12 23:06:22 +01:00
Like Xu ea39f9b643 target/i386: define a new MSR based feature word - FEAT_PERF_CAPABILITIES
The Perfmon and Debug Capability MSR named IA32_PERF_CAPABILITIES is
a feature-enumerating MSR, which only enumerates the feature full-width
write (via bit 13) by now which indicates the processor supports IA32_A_PMCx
interface for updating bits 32 and above of IA32_PMCx.

The existence of MSR IA32_PERF_CAPABILITIES is enumerated by CPUID.1:ECX[15].

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: qemu-devel@nongnu.org
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20200529074347.124619-5-like.xu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:10:47 -04:00
Cathy Zhang 353f98c9ad x86/cpu: Enable AVX512_VP2INTERSECT cpu feature
AVX512_VP2INTERSECT compute vector pair intersection to a pair
of mask registers, which is introduced with intel Tiger Lake,
defining as CPUID.(EAX=7,ECX=0):EDX[bit 08].

Refer to the following release spec:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Message-Id: <1586760758-13638-1-git-send-email-cathy.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:10:27 -04:00
Philippe Mathieu-Daudé da278d58a0 accel: Move Xen accelerator code under accel/xen/
This code is not related to hardware emulation.
Move it under accel/ with the other hypervisors.

Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200508100222.7112-1-philmd@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:56 -04:00
Babu Moger cac9edfc4d target/i386: Fix the CPUID leaf CPUID_Fn80000008
CPUID leaf CPUID_Fn80000008_ECX provides information about the
number of threads supported by the processor. It was found that
the field ApicIdSize(bits 15-12) was not set correctly.

ApicIdSize is defined as the number of bits required to represent
all the ApicId values within a package.

Valid Values: Value Description
3h-0h		Reserved.
4h		up to 16 threads.
5h		up to 32 threads.
6h		up to 64 threads.
7h		up to 128 threads.
Fh-8h		Reserved.

Fix the bit appropriately.

This came up during following thread.
https://lore.kernel.org/qemu-devel/158643709116.17430.15995069125716778943.malonedeb@wampee.canonical.com/#t

Refer the Processor Programming Reference (PPR) for AMD Family 17h
Model 01h, Revision B1 Processors. The documentation is available
from the bugzilla Link below.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537

Reported-by: Philipp Eppelt <1871842@bugs.launchpad.net>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <20200417215345.64800.73351.stgit@localhost.localdomain>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-10 12:09:42 -04:00
Philippe Mathieu-Daudé 3fb79344bd target/i386/cpu: Use the IEC binary prefix definitions
IEC binary prefixes ease code review: the unit is explicit.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20200601142930.29408-9-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-06-09 19:58:53 +02:00
Peter Maydell 49ee115552 linux-user pull request 20200605-v2
Implement F_OFD_ fcntl() command, /proc/cpuinfo for hppa
 Fix socket(), prnctl() error codes, underflow in target_mremap,
     epoll_create() strace, oldumount for alpha
 User-mode build dependencies improvement
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAl7blvgSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748Nf0P/1QF1Y4A2I/SI53TksSWS55wlWCfV/wd
 SXUSjTmM1W4Y/tKScwkjooClYeVV59Ie5VL7WMdLO0YGxTQC7jqBONHAuaxSb4ky
 qNI5pvW0fpfl4i1ThC7XIlihOn49WlzEczTZqLRMuOh28nr3gJQCWweo/QIQoyUl
 KNcCLgQiY3raBi7nykC26dRc8DvV6sSu+qcoTk8A0FRbEfUDf+sj/njY3Xh8AZN7
 FAn4iscV/UIAnGM6VGQzGfUHfBBL28rkmg/++oQrnvnH8blx0O1NrdDsgiHuKT7P
 /OC4tzpp6IkBzOy+sL3V/QdjKoxmMPadDj39rGLnZTQ6GZFXRZgVckknVdupTZD6
 77lmnvbQMKmsKJWwn8zrd3RtwG2L6tWHgm16ZUXXaU+lFDa/xn55o4KnMdgZXEGP
 +7EHf2IfkZfiFmblBWiJi7OMg2wzSDQaAIBTMr43nJfDwZUvKGnAHccuVLQitpGe
 4dRN6lCT0K1h6WwNhLRH/Fqqhi9vN7o3sSUQVm128XzYOOPDRyau/R4F1AQNbNdU
 +ZyrZqSvQxSxH0VNeu5wHRiwOym3bFJTVmGd5cWMzXs1kb+vmMG5ZjGBfxFr6gbC
 9bVwDMxJ9vU5ExIZfUg4J/dOtWccJuADj11QPDcm8et3Tbqy1iiV9Py2k5IsIgqM
 BCsqBl3i+ekz
 =5q4d
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-5.1-pull-request' into staging

linux-user pull request 20200605-v2

Implement F_OFD_ fcntl() command, /proc/cpuinfo for hppa
Fix socket(), prnctl() error codes, underflow in target_mremap,
    epoll_create() strace, oldumount for alpha
User-mode build dependencies improvement

# gpg: Signature made Sat 06 Jun 2020 14:15:36 BST
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-5.1-pull-request:
  stubs: Restrict ui/win32-kbd-hook to system-mode
  hw/core: Restrict CpuClass::get_crash_info() to system-mode
  target/s390x: Restrict CpuClass::get_crash_info() to system-mode
  target/i386: Restrict CpuClass::get_crash_info() to system-mode
  arch_init: Remove unused 'qapi-commands-misc.h' include
  exec: Assert CPU migration is not used on user-only build
  target/riscv/cpu: Restrict CPU migration to system-mode
  stubs/Makefile: Reduce the user-mode object list
  util/Makefile: Reduce the user-mode object list
  tests/Makefile: Restrict some softmmu-only tests
  tests/Makefile: Only display TCG-related tests when TCG is available
  configure: Avoid building TCG when not needed
  Makefile: Only build virtiofsd if system-mode is enabled
  linux-user: implement OFD locks
  linux-user/mmap.c: fix integer underflow in target_mremap
  linux-user/strace.list: fix epoll_create{,1} -strace output
  linux-user: Add support for /proc/cpuinfo on hppa platform
  linux-user: return target error codes for socket() and prctl()
  linux-user, alpha: fix oldumount syscall

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-06-08 11:04:57 +01:00
Philippe Mathieu-Daudé b75c990080 target/i386: Restrict CpuClass::get_crash_info() to system-mode
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200522172510.25784-11-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2020-06-05 21:23:22 +02:00
Markus Armbruster 49e2fa85ff i386: Fix x86_cpu_load_model() error API violation
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.

x86_cpu_load_model() is wrong that way.  Harmless, because its @errp
is always &error_abort.  To fix, cut out the @errp middleman.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20200505101908.6207-11-armbru@redhat.com>
2020-05-27 07:45:45 +02:00
Markus Armbruster b69c3c21a5 qdev: Unrealize must not fail
Devices may have component devices and buses.

Device realization may fail.  Realization is recursive: a device's
realize() method realizes its components, and device_set_realized()
realizes its buses (which should in turn realize the devices on that
bus, except bus_set_realized() doesn't implement that, yet).

When realization of a component or bus fails, we need to roll back:
unrealize everything we realized so far.  If any of these unrealizes
failed, the device would be left in an inconsistent state.  Must not
happen.

device_set_realized() lets it happen: it ignores errors in the roll
back code starting at label child_realize_fail.

Since realization is recursive, unrealization must be recursive, too.
But how could a partly failed unrealize be rolled back?  We'd have to
re-realize, which can fail.  This design is fundamentally broken.

device_set_realized() does not roll back at all.  Instead, it keeps
unrealizing, ignoring further errors.

It can screw up even for a device with no buses: if the lone
dc->unrealize() fails, it still unregisters vmstate, and calls
listeners' unrealize() callback.

bus_set_realized() does not roll back either.  Instead, it stops
unrealizing.

Fortunately, no unrealize method can fail, as we'll see below.

To fix the design error, drop parameter @errp from all the unrealize
methods.

Any unrealize method that uses @errp now needs an update.  This leads
us to unrealize() methods that can fail.  Merely passing it to another
unrealize method cannot cause failure, though.  Here are the ones that
do other things with @errp:

* virtio_serial_device_unrealize()

  Fails when qbus_set_hotplug_handler() fails, but still does all the
  other work.  On failure, the device would stay realized with its
  resources completely gone.  Oops.  Can't happen, because
  qbus_set_hotplug_handler() can't actually fail here.  Pass
  &error_abort to qbus_set_hotplug_handler() instead.

* hw/ppc/spapr_drc.c's unrealize()

  Fails when object_property_del() fails, but all the other work is
  already done.  On failure, the device would stay realized with its
  vmstate registration gone.  Oops.  Can't happen, because
  object_property_del() can't actually fail here.  Pass &error_abort
  to object_property_del() instead.

* spapr_phb_unrealize()

  Fails and bails out when remove_drcs() fails, but other work is
  already done.  On failure, the device would stay realized with some
  of its resources gone.  Oops.  remove_drcs() fails only when
  chassis_from_bus()'s object_property_get_uint() fails, and it can't
  here.  Pass &error_abort to remove_drcs() instead.

Therefore, no unrealize method can fail before this patch.

device_set_realized()'s recursive unrealization via bus uses
object_property_set_bool().  Can't drop @errp there, so pass
&error_abort.

We similarly unrealize with object_property_set_bool() elsewhere,
always ignoring errors.  Pass &error_abort instead.

Several unrealize methods no longer handle errors from other unrealize
methods: virtio_9p_device_unrealize(),
virtio_input_device_unrealize(), scsi_qdev_unrealize(), ...
Much of the deleted error handling looks wrong anyway.

One unrealize methods no longer ignore such errors:
usb_ehci_pci_exit().

Several realize methods no longer ignore errors when rolling back:
v9fs_device_realize_common(), pci_qdev_unrealize(),
spapr_phb_realize(), usb_qdev_realize(), vfio_ccw_realize(),
virtio_device_realize().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-17-armbru@redhat.com>
2020-05-15 07:08:14 +02:00
Markus Armbruster d2623129a7 qom: Drop parameter @errp of object_property_add() & friends
The only way object_property_add() can fail is when a property with
the same name already exists.  Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.

Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent.  Parentage is
also under program control, so this is a programming error, too.

We have a bit over 500 callers.  Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.

The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.

Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL.  Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call.  ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.

When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.

Drop parameter @errp and assert the preconditions instead.

There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification".  Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
2020-05-15 07:07:58 +02:00
Philippe Mathieu-Daudé 78ee6bd048 various: Remove suspicious '\' character outside of #define in C code
Fixes the following coccinelle warnings:

  $ spatch --sp-file --verbose-parsing  ... \
      scripts/coccinelle/remove_local_err.cocci
  ...
  SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5213
  SUSPICIOUS: a \ character appears outside of a #define at ./target/ppc/translate_init.inc.c:5261
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:166
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:167
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:169
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:170
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:171
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:172
  SUSPICIOUS: a \ character appears outside of a #define at ./target/microblaze/cpu.c:173
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5787
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5789
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5800
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5801
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5802
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5804
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5805
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:5806
  SUSPICIOUS: a \ character appears outside of a #define at ./target/i386/cpu.c:6329
  SUSPICIOUS: a \ character appears outside of a #define at ./hw/sd/sdhci.c:1133
  SUSPICIOUS: a \ character appears outside of a #define at ./hw/scsi/scsi-disk.c:3081
  SUSPICIOUS: a \ character appears outside of a #define at ./hw/net/virtio-net.c:1529
  SUSPICIOUS: a \ character appears outside of a #define at ./hw/riscv/sifive_u.c:468
  SUSPICIOUS: a \ character appears outside of a #define at ./dump/dump.c:1895
  SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2209
  SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2215
  SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2221
  SUSPICIOUS: a \ character appears outside of a #define at ./block/vhdx.c:2222
  SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:172
  SUSPICIOUS: a \ character appears outside of a #define at ./block/replication.c:173

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20200412223619.11284-2-f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2020-04-29 08:01:51 +02:00
Xiaoyao Li d965dc3559 target/i386: Add ARCH_CAPABILITIES related bits into Icelake-Server CPU model
Current Icelake-Server CPU model lacks all the features enumerated by
MSR_IA32_ARCH_CAPABILITIES.

Add them, so that guest of "Icelake-Server" can see all of them.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20200316095605.12318-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-31 19:13:32 -03:00
Luwei Kang ddc2fc9e4e target/i386: set the CPUID level to 0x14 on old machine-type
The CPUID level need to be set to 0x14 manually on old
machine-type if Intel PT is enabled in guest. E.g. the
CPUID[0].EAX(level)=7 and CPUID[7].EBX[25](intel-pt)=1 when the
Qemu with "-machine pc-i440fx-3.1 -cpu qemu64,+intel-pt" parameter.

Some Intel PT capabilities are exposed by leaf 0x14 and the
missing capabilities will cause some MSRs access failed.
This patch add a warning message to inform the user to extend
the CPUID level.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1584031686-16444-1-git-send-email-luwei.kang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-31 19:13:32 -03:00
Babu Moger 7b225762c8 i386: Fix pkg_id offset for EPYC cpu models
If the system is numa configured the pkg_offset needs
to be adjusted for EPYC cpu models. Fix it calling the
model specific handler.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396725589.58170.16424607815207074485.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-31 19:13:32 -03:00
Babu Moger 247b18c593 target/i386: Enable new apic id encoding for EPYC based cpus models
The APIC ID is decoded based on the sequence sockets->dies->cores->threads.
This works fine for most standard AMD and other vendors' configurations,
but this decoding sequence does not follow that of AMD's APIC ID enumeration
strictly. In some cases this can cause CPU topology inconsistency.

When booting a guest VM, the kernel tries to validate the topology, and finds
it inconsistent with the enumeration of EPYC cpu models. The more details are
in the bug https://bugzilla.redhat.com/show_bug.cgi?id=1728166.

To fix the problem we need to build the topology as per the Processor
Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1
Processors. The documentation is available from the bugzilla Link below.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537
It is also available at
https://www.amd.com/system/files/TechDocs/55570-B1_PUB.zip

Here is the text from the PPR.
Operating systems are expected to use Core::X86::Cpuid::SizeId[ApicIdSize], the
number of least significant bits in the Initial APIC ID that indicate core ID
within a processor, in constructing per-core CPUID masks.
Core::X86::Cpuid::SizeId[ApicIdSize] determines the maximum number of cores
(MNC) that the processor could theoretically support, not the actual number of
cores that are actually implemented or enabled on the processor, as indicated
by Core::X86::Cpuid::SizeId[NC].
Each Core::X86::Apic::ApicId[ApicId] register is preset as follows:
• ApicId[6] = Socket ID.
• ApicId[5:4] = Node ID.
• ApicId[3] = Logical CCX L3 complex ID
• ApicId[2:0]= (SMT) ? {LogicalCoreID[1:0],ThreadId} : {1'b0,LogicalCoreID[1:0]}

The new apic id encoding is enabled for EPYC and EPYC-Rome models.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <158396724913.58170.3539083528095710811.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-31 19:13:32 -03:00
Babu Moger 0c1538cb1a i386: Introduce use_epyc_apic_id_encoding in X86CPUDefinition
Add a boolean variable use_epyc_apic_id_encoding in X86CPUDefinition.
This will be set if this cpu model needs to use new EPYC based
apic id encoding.

Override the handlers with EPYC based handlers if use_epyc_apic_id_encoding
is set. This will be done in x86_cpus_init.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <158396723514.58170.14825482171652019765.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-31 19:13:32 -03:00
Babu Moger dd08ef0318 target/i386: Cleanup and use the EPYC mode topology functions
Use the new functions from topology.h and delete the unused code. Given the
sockets, nodes, cores and threads, the new functions generate apic id for EPYC
mode. Removes all the hardcoded values.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <158396722151.58170.8031705769621392927.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-31 19:13:32 -03:00
Babu Moger c24a41bb53 hw/i386: Update structures to save the number of nodes per package
Update structures X86CPUTopoIDs and CPUX86State to hold the number of
nodes per package. This is required to build EPYC mode topology.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396720035.58170.1973738805301006456.stgit@naples-babu.amd.com>
2020-03-17 19:48:10 -04:00
Babu Moger f20dec0b63 hw/i386: Consolidate topology functions
Now that we have all the parameters in X86CPUTopoInfo, we can just
pass the structure to calculate the offsets and width.

Signed-off-by: Babu Moger <babu.moger@amd.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <158396717953.58170.5628042059144117669.stgit@naples-babu.amd.com>
2020-03-17 19:48:10 -04:00
Peter Maydell 781c67ca55 cpu: Use DeviceClass reset instead of a special CPUClass reset
The CPUClass has a 'reset' method.  This is a legacy from when
TYPE_CPU used not to inherit from TYPE_DEVICE.  We don't need it any
more, as we can simply use the TYPE_DEVICE reset.  The 'cpu_reset()'
function is kept as the API which most places use to reset a CPU; it
is now a wrapper which calls device_cold_reset() and then the
tracepoint function.

This change should not cause CPU objects to be reset more often
than they are at the moment, because:
 * nobody is directly calling device_cold_reset() or
   qdev_reset_all() on CPU objects
 * no CPU object is on a qbus, so they will not be reset either
   by somebody calling qbus_reset_all()/bus_cold_reset(), or
   by the main "reset sysbus and everything in the qbus tree"
   reset that most devices are reset by

Note that this does not change the need for each machine or whatever
to use qemu_register_reset() to arrange to call cpu_reset() -- that
is necessary because CPU objects are not on any qbus, so they don't
get reset when the qbus tree rooted at the sysbus bus is reset, and
this isn't being changed here.

All the changes to the files under target/ were made using the
included Coccinelle script, except:

(1) the deletion of the now-inaccurate and not terribly useful
"CPUClass::reset" comments was done with a perl one-liner afterwards:
  perl -n -i -e '/ CPUClass::reset/ or print' target/*/*.c

(2) this bit of the s390 change was done by hand, because the
Coccinelle script is not sophisticated enough to handle the
parent_reset call being inside another function:

| @@ -96,8 +96,9 @@ static void s390_cpu_reset(CPUState *s, cpu_reset_type type)
|     S390CPU *cpu = S390_CPU(s);
|     S390CPUClass *scc = S390_CPU_GET_CLASS(cpu);
|     CPUS390XState *env = &cpu->env;
|+    DeviceState *dev = DEVICE(s);
|
|-    scc->parent_reset(s);
|+    scc->parent_reset(dev);
|     cpu->env.sigp_order = 0;
|     s390_cpu_set_state(S390_CPU_STATE_STOPPED, cpu);

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20200303100511.5498-1-peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-17 19:48:10 -04:00
Moger, Babu 143c30d4d3 i386: Add 2nd Generation AMD EPYC processors
Adds the support for 2nd Gen AMD EPYC Processors. The model display
name will be EPYC-Rome.

Adds the following new feature bits on top of the feature bits from the
first generation EPYC models.
perfctr-core : core performance counter extensions support. Enables the VM to
               use extended performance counter support. It enables six
               programmable counters instead of four counters.
clzero       : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
               pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
               pointers.
wbnoinvd     : Write back and do not invalidate cache
ibpb         : Indirect Branch Prediction Barrier
amd-stibp    : Single Thread Indirect Branch Predictor
clwb         : Cache Line Write Back and Retain
xsaves       : XSAVES, XRSTORS and IA32_XSS support
rdpid        : Read Processor ID instruction support
umip         : User-Mode Instruction Prevention support

The  Reference documents are available at
https://developer.amd.com/wp-content/resources/55803_0.54-PUB.pdf
https://www.amd.com/system/files/TechDocs/24594.pdf

Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
6d61e3c32248 ("kvm: x86: Expose RDPID in KVM_GET_SUPPORTED_CPUID")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <157314966312.23828.17684821666338093910.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-17 19:48:10 -04:00
Moger, Babu a16e8dbc04 i386: Add missing cpu feature bits in EPYC model
Adds the following missing CPUID bits:
perfctr-core : core performance counter extensions support. Enables the VM
               to use extended performance counter support. It enables six
               programmable counters instead of 4 counters.
clzero       : instruction zeroes out the 64 byte cache line specified in RAX.
xsaveerptr   : XSAVE, XSAVE, FXSAVEOPT, XSAVEC, XSAVES always save error
               pointers and FXRSTOR, XRSTOR, XRSTORS always restore error
               pointers.
ibpb         : Indirect Branch Prediction Barrie.
xsaves       : XSAVES, XRSTORS and IA32_XSS supported.

Depends on following kernel commits:
40bc47b08b6e ("kvm: x86: Enumerate support for CLZERO instruction")
504ce1954fba ("KVM: x86: Expose XSAVEERPTR to the guest")
52297436199d ("kvm: svm: Update svm_xsaves_supported")

These new features will be added in EPYC-v3. The -cpu help output after the change.
x86 EPYC-v1               AMD EPYC Processor
x86 EPYC-v2               AMD EPYC Processor (with IBPB)
x86 EPYC-v3               AMD EPYC Processor

Signed-off-by: Babu Moger <babu.moger@amd.com>
Message-Id: <157314965662.23828.3063243729449408327.stgit@naples-babu.amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-17 19:48:10 -04:00
Tao Xu c63938df0a target/i386: Add new property note to versioned CPU models
Add additional information for -cpu help to indicate the changes in this
version of CPU model.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200212081328.7385-4-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-17 19:48:10 -04:00
Tao Xu ab0c942c86 target/i386: Add Denverton-v2 (no MPX) CPU model
Because MPX is being removed from the linux kernel, remove MPX feature
from Denverton.

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20200212081328.7385-2-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2020-03-17 19:48:10 -04:00
Paolo Bonzini be02cda3af target/i386: enable monitor and ucode revision with -cpu max
These two features were incorrectly tied to host_cpuid_required rather than
cpu->max_features.  As a result, -cpu max was not enabling either MONITOR
features or ucode revision.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-02-12 16:29:51 +01:00
Kashyap Chamarthy 673b0add9e target/i386: Add the 'model-id' for Skylake -v3 CPU models
This fixes a confusion in the help output.  (Although, if you squint
long enough at the '-cpu help' output, you _do_ notice that
"Skylake-Client-noTSX-IBRS" is an alias of "Skylake-Client-v3";
similarly for Skylake-Server-v3.)

Without this patch:

    $ qemu-system-x86 -cpu help
    ...
    x86 Skylake-Client-v1     Intel Core Processor (Skylake)
    x86 Skylake-Client-v2     Intel Core Processor (Skylake, IBRS)
    x86 Skylake-Client-v3     Intel Core Processor (Skylake, IBRS)
    ...
    x86 Skylake-Server-v1     Intel Xeon Processor (Skylake)
    x86 Skylake-Server-v2     Intel Xeon Processor (Skylake, IBRS)
    x86 Skylake-Server-v3     Intel Xeon Processor (Skylake, IBRS)
    ...

With this patch:

    $ ./qemu-system-x86 -cpu help
    ...
    x86 Skylake-Client-v1     Intel Core Processor (Skylake)
    x86 Skylake-Client-v2     Intel Core Processor (Skylake, IBRS)
    x86 Skylake-Client-v3     Intel Core Processor (Skylake, IBRS, no TSX)
    ...
    x86 Skylake-Server-v1     Intel Xeon Processor (Skylake)
    x86 Skylake-Server-v2     Intel Xeon Processor (Skylake, IBRS)
    x86 Skylake-Server-v3     Intel Xeon Processor (Skylake, IBRS, no TSX)
    ...

Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-Id: <20200123090116.14409-1-kchamart@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:17 +01:00
Marc-André Lureau 4f67d30b5e qdev: set properties with device_class_set_props()
The following patch will need to handle properties registration during
class_init time. Let's use a device_class_set_props() setter.

spatch --macro-file scripts/cocci-macro-file.h  --sp-file
./scripts/coccinelle/qdev-set-props.cocci --keep-comments --in-place
--dir .

@@
typedef DeviceClass;
DeviceClass *d;
expression val;
@@
- d->props = val
+ device_class_set_props(d, val)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20200110153039.1379601-20-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:15 +01:00
Paolo Bonzini 32c87d70ff target/i386: kvm: initialize microcode revision from KVM
KVM can return the host microcode revision as a feature MSR.
Use it as the default value for -cpu host.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1579544504-3616-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:10 +01:00
Paolo Bonzini 4e45aff398 target/i386: add a ucode-rev property
Add the property and plumb it in TCG and HVF (the latter of which
tried to support returning a constant value but used the wrong MSR).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1579544504-3616-3-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:09 +01:00
Greg Kurz bc9888f759 cpu: Use cpu_class_set_parent_reset()
Convert all targets to use cpu_class_set_parent_reset() with the following
coccinelle script:

@@
type CPUParentClass;
CPUParentClass *pcc;
CPUClass *cc;
identifier parent_fn;
identifier child_fn;
@@
+cpu_class_set_parent_reset(cc, child_fn, &pcc->parent_fn);
-pcc->parent_fn = cc->reset;
...
-cc->reset = child_fn;

Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Message-Id: <157650847817.354886.7047137349018460524.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 20:59:06 +01:00
Xiaoyao Li 2dea9d9ca4 target/i386: Add missed features to Cooperlake CPU model
It lacks VMX features and two security feature bits (disclosed recently) in
MSR_IA32_ARCH_CAPABILITIES in current Cooperlake CPU model, so add them.

Fixes: 22a866b616 ("i386: Add new CPU model Cooperlake")
Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191225063018.20038-3-xiaoyao.li@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-07 14:31:03 +01:00
Peter Maydell 6fb0dae9ef x86 and machine queue, 2019-12-20
Bug fix:
 * Resolve CPU models to v1 by default (Eduardo Habkost)
 
 Cleanup:
 * Remove incorrect numa_mem_supported checks (Igor Mammedov)
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEWjIv1avE09usz9GqKAeTb5hNxaYFAl39HqYUHGVoYWJrb3N0
 QHJlZGhhdC5jb20ACgkQKAeTb5hNxabHeBAAkybU8+JzzqXoG9e16MiQiUQ0vSy9
 MFkWIlsD5RCncdlI7s7yyuPUa7GEkJztRxzanvP2BcbMvHHpaM01EgOsZuZfld8Z
 R6lQaTZdAC4XQFPmD14ccIQ/r8cDUXRfUhasKXq3tNQdXORUw5/T9XHwyn3kvHUT
 /nEglWdUG0LmRQMQNRpbSgQ4B0jx+RwRg6KLGRm/mqlwiFV8nULLB8IYDMrxHSu3
 iY/PAFOMqQbRbbDQ7rK3l7u0TyRTB41FTx8s2eT9Is2V3HZU9P9lbWPQBMnxPwxm
 VYo/LVO6smZ9gZbyCcPZtOn95ay5gGk+fQ9Twg6/l1tHsK7vmNxn8Z3y+QWEvJ30
 BnOJ2Y0RaFNBDrhiIqJu12Lp0nJXMDi96tAS71hqwsJssjzLYSpD/faoKO0vDyR9
 RLoumrXLcrgeMopRKsft8ZkJIakHlXc+85AuIMZ9obhcz4liC7r/IbjOqOumKTPN
 8feLmzqdldAmh0jvJCfyu1n4qhH4KUPPrFxOvZfuzdWkvSUbcJSkQaPwYxxQaFvo
 9jRHwNNF4MTnImgQIw59ao/u6JXVM+4oY5dc+BjeGTefQKuwRRvT/54Z+v7jULwK
 ZKGlLnCRlYeD/U+67iBIeV2nrRM7pTkcsTWmhX+/u2pwyKpmiA4quG63KmR7dyDK
 6HqJez6jOKTEARU=
 =psTk
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-and-machine-pull-request' into staging

x86 and machine queue, 2019-12-20

Bug fix:
* Resolve CPU models to v1 by default (Eduardo Habkost)

Cleanup:
* Remove incorrect numa_mem_supported checks (Igor Mammedov)

# gpg: Signature made Fri 20 Dec 2019 19:19:02 GMT
# gpg:                using RSA key 5A322FD5ABC4D3DBACCFD1AA2807936F984DC5A6
# gpg:                issuer "ehabkost@redhat.com"
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-and-machine-pull-request:
  numa: properly check if numa is supported
  numa: remove not needed check
  i386: Resolve CPU models to v1 by default

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2020-01-06 14:08:04 +00:00
Eduardo Habkost ad18392892 i386: Resolve CPU models to v1 by default
When using `query-cpu-definitions` using `-machine none`,
QEMU is resolving all CPU models to their latest versions.  The
actual CPU model version being used by another machine type (e.g.
`pc-q35-4.0`) might be different.

In theory, this was OK because the correct CPU model
version is returned when using the correct `-machine` argument.

Except that in practice, this breaks libvirt expectations:
libvirt always use `-machine none` when checking if a CPU model
is runnable, because runnability is not expected to be affected
when the machine type is changed.

For example, when running on a Haswell host without TSX,
Haswell-v4 is runnable, but Haswell-v1 is not.  On those hosts,
`query-cpu-definitions` says Haswell is runnable if using
`-machine none`, but Haswell is actually not runnable using any
of the `pc-*` machine types (because they resolve Haswell to
Haswell-v1).  In other words, we're breaking the "runnability
guarantee" we promised to not break for a few releases (see
qemu-deprecated.texi).

To address this issue, change the default CPU model version to v1
on all machine types, so we make `query-cpu-definitions` output
when using `-machine none` match the results when using `pc-*`.
This will change in the future (the plan is to always return the
latest CPU model version if using `-machine none`), but only
after giving libvirt the opportunity to adapt.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1779078
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191205223339.764534-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-19 14:38:51 -03:00
Paolo Bonzini 3c75e12ea6 qom: add object_new_with_class
Similar to CPU and machine classes, "-accel" class names are mangled,
so we have to first get a class via accel_find and then instantiate it.
Provide a new function to instantiate a class without going through
object_class_get_name, and use it for CPUs and machines already.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-12-17 19:32:26 +01:00
Eduardo Habkost 88703ce2e6 i386: Use g_autofree in a few places
Get rid of 12 explicit g_free() calls.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20191025025632.5928-1-ehabkost@redhat.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-13 16:32:19 -03:00
Cathy Zhang 22a866b616 i386: Add new CPU model Cooperlake
Cooper Lake is intel's successor to Cascade Lake, the new
CPU model inherits features from Cascadelake-Server, while
add one platform associated new feature: AVX512_BF16. Meanwhile,
add STIBP for speculative execution.

Signed-off-by: Cathy Zhang <cathy.zhang@intel.com>
Reviewed-by: Xiaoyao Li <xiaoyao.li@intel.com>
Reviewed-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <1571729728-23284-4-git-send-email-cathy.zhang@intel.com>
Reviewed-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-12-13 16:32:19 -03:00
Paolo Bonzini c6f3215ffa target/i386: add two missing VMX features for Skylake and CascadeLake Server
They are present in client (Core) Skylake but pasted wrong into the server
SKUs.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-26 09:55:12 +01:00
Eduardo Habkost 02fa60d101 i386: Add -noTSX aliases for hle=off, rtm=off CPU models
We have been trying to avoid adding new aliases for CPU model
versions, but in the case of changes in defaults introduced by
the TAA mitigation patches, the aliases might help avoid user
confusion when applying host software updates.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:35:05 +01:00
Eduardo Habkost 9ab2237f19 i386: Add new versions of Skylake/Cascadelake/Icelake without TSX
One of the mitigation methods for TAA[1] is to disable TSX
support on the host system.  Linux added a mechanism to disable
TSX globally through the kernel command line, and many Linux
distributions now default to tsx=off.  This makes existing CPU
models that have HLE and RTM enabled not usable anymore.

Add new versions of all CPU models that have the HLE and RTM
features enabled, that can be used when TSX is disabled in the
host system.

References:

[1] TAA, TSX asynchronous Abort:
    https://software.intel.com/security-software-guidance/insights/deep-dive-intel-transactional-synchronization-extensions-intel-tsx-asynchronous-abort
    https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/tsx_async_abort.html

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:35:05 +01:00
Paolo Bonzini 2a9758c51e target/i386: add support for MSR_IA32_TSX_CTRL
The MSR_IA32_TSX_CTRL MSR can be used to hide TSX (also known as the
Trusty Side-channel Extension).  By virtualizing the MSR, KVM guests
can disable TSX and avoid paying the price of mitigating TSX-based
attacks on microarchitectural side channels.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:35:05 +01:00
Paolo Bonzini 0723cc8a55 target/i386: add VMX features to named CPU models
This allows using "-cpu Haswell,+vmx", which we did not really want to
support in QEMU but was produced by Libvirt when using the "host-model"
CPU model.  Without this patch, no VMX feature is _actually_ supported
(only the basic instruction set extensions are) and KVM fails to load
in the guest.

This was produced from the output of scripts/kvm/vmxcap using the following
very ugly Python script:

    bits = {
            'INS/OUTS instruction information': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_INS_OUTS'],
            'IA32_VMX_TRUE_*_CTLS support': ['FEAT_VMX_BASIC', 'MSR_VMX_BASIC_TRUE_CTLS'],
            'External interrupt exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_EXT_INTR_MASK'],
            'NMI exiting': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_NMI_EXITING'],
            'Virtual NMIs': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VIRTUAL_NMIS'],
            'Activate VMX-preemption timer': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_VMX_PREEMPTION_TIMER'],
            'Process posted interrupts': ['FEAT_VMX_PINBASED_CTLS', 'VMX_PIN_BASED_POSTED_INTR'],
            'Interrupt window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_INTR_PENDING'],
            'Use TSC offsetting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_TSC_OFFSETING'],
            'HLT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_HLT_EXITING'],
            'INVLPG exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_INVLPG_EXITING'],
            'MWAIT exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MWAIT_EXITING'],
            'RDPMC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDPMC_EXITING'],
            'RDTSC exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_RDTSC_EXITING'],
            'CR3-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_LOAD_EXITING'],
            'CR3-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR3_STORE_EXITING'],
            'CR8-load exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_LOAD_EXITING'],
            'CR8-store exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_CR8_STORE_EXITING'],
            'Use TPR shadow': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_TPR_SHADOW'],
            'NMI-window exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_VIRTUAL_NMI_PENDING'],
            'MOV-DR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MOV_DR_EXITING'],
            'Unconditional I/O exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_UNCOND_IO_EXITING'],
            'Use I/O bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_IO_BITMAPS'],
            'Monitor trap flag': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_TRAP_FLAG'],
            'Use MSR bitmaps': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_USE_MSR_BITMAPS'],
            'MONITOR exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_MONITOR_EXITING'],
            'PAUSE exiting': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_PAUSE_EXITING'],
            'Activate secondary control': ['FEAT_VMX_PROCBASED_CTLS', 'VMX_CPU_BASED_ACTIVATE_SECONDARY_CONTROLS'],
            'Virtualize APIC accesses': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES'],
            'Enable EPT': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_EPT'],
            'Descriptor-table exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_DESC'],
            'Enable RDTSCP': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDTSCP'],
            'Virtualize x2APIC mode': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE'],
            'Enable VPID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VPID'],
            'WBINVD exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_WBINVD_EXITING'],
            'Unrestricted guest': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_UNRESTRICTED_GUEST'],
            'APIC register emulation': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_APIC_REGISTER_VIRT'],
            'Virtual interrupt delivery': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY'],
            'PAUSE-loop exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_PAUSE_LOOP_EXITING'],
            'RDRAND exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDRAND_EXITING'],
            'Enable INVPCID': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_INVPCID'],
            'Enable VM functions': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_VMFUNC'],
            'VMCS shadowing': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_SHADOW_VMCS'],
            'RDSEED exiting': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_RDSEED_EXITING'],
            'Enable PML': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_ENABLE_PML'],
            'Enable XSAVES/XRSTORS': ['FEAT_VMX_SECONDARY_CTLS', 'VMX_SECONDARY_EXEC_XSAVES'],
            'Save debug controls': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_DEBUG_CONTROLS'],
            'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PERF_GLOBAL_CTRL'],
            'Acknowledge interrupt on exit': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_ACK_INTR_ON_EXIT'],
            'Save IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_PAT'],
            'Load IA32_PAT': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_PAT'],
            'Save IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_IA32_EFER'],
            'Load IA32_EFER': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_LOAD_IA32_EFER'],
            'Save VMX-preemption timer value': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_SAVE_VMX_PREEMPTION_TIMER'],
            'Clear IA32_BNDCFGS': ['FEAT_VMX_EXIT_CTLS', 'VMX_VM_EXIT_CLEAR_BNDCFGS'],
            'Load debug controls': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_DEBUG_CONTROLS'],
            'IA-32e mode guest': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_IA32E_MODE'],
            'Load IA32_PERF_GLOBAL_CTRL': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL'],
            'Load IA32_PAT': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_PAT'],
            'Load IA32_EFER': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_IA32_EFER'],
            'Load IA32_BNDCFGS': ['FEAT_VMX_ENTRY_CTLS', 'VMX_VM_ENTRY_LOAD_BNDCFGS'],
            'Store EFER.LMA into IA-32e mode guest control': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_STORE_LMA'],
            'HLT activity state': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ACTIVITY_HLT'],
            'VMWRITE to VM-exit information fields': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_VMWRITE_VMEXIT'],
            'Inject event with insn length=0': ['FEAT_VMX_MISC', 'MSR_VMX_MISC_ZERO_LEN_INJECT'],
            'Execute-only EPT translations': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_EXECONLY'],
            'Page-walk length 4': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_PAGE_WALK_LENGTH_4'],
            'Paging-structure memory type WB': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_WB'],
            '2MB EPT pages': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_2MB | MSR_VMX_EPT_1GB'],
            'INVEPT supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT'],
            'EPT accessed and dirty flags': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_AD_BITS'],
            'Single-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_SINGLE_CONTEXT'],
            'All-context INVEPT': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVEPT_ALL_CONTEXT'],
            'INVVPID supported': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID'],
            'Individual-address INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_ADDR'],
            'Single-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT'],
            'All-context INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_ALL_CONTEXT'],
            'Single-context-retaining-globals INVVPID': ['FEAT_VMX_EPT_VPID_CAPS', 'MSR_VMX_EPT_INVVPID_SINGLE_CONTEXT_NOGLOBALS'],
            'EPTP Switching': ['FEAT_VMX_VMFUNC', 'MSR_VMX_VMFUNC_EPT_SWITCHING']
    }

    import sys
    import textwrap

    out = {}
    for l in sys.stdin.readlines():
        l = l.rstrip()
        if l.endswith('!!'):
            l = l[:-2].rstrip()
        if l.startswith('    ') and (l.endswith('default') or l.endswith('yes')):
            l = l[4:]
            for key, value in bits.items():
                if l.startswith(key):
                    ctl, bit = value
                    if ctl in out:
                        out[ctl] = out[ctl] + ' | '
                    else:
                        out[ctl] = '    [%s] = ' % ctl
                    out[ctl] = out[ctl] + bit

    for x in sorted(out.keys()):
        print("\n         ".join(textwrap.wrap(out[x] + ",")))

Note that the script has a bug in that some keys apply to both VM entry
and VM exit controls ("load IA32_PERF_GLOBAL_CTRL", "load IA32_EFER",
"load IA32_PAT".  Those have to be fixed by hand.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-21 16:33:53 +01:00
Pawan Gupta 7fac38635e target/i386: Export TAA_NO bit to guests
TSX Async Abort (TAA) is a side channel attack on internal buffers in
some Intel processors similar to Microachitectural Data Sampling (MDS).

Some future Intel processors will use the ARCH_CAP_TAA_NO bit in the
IA32_ARCH_CAPABILITIES MSR to report that they are not vulnerable to
TAA. Make this bit available to guests.

Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19 10:01:32 +01:00
Paolo Bonzini 7f7a585d5b target/i386: add PSCHANGE_NO bit for the ARCH_CAPABILITIES MSR
This is required to disable ITLB multihit mitigations in nested
hypervisors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-11-19 10:00:36 +01:00
Paolo Bonzini 673652a785 Merge commit 'df84f17' into HEAD
This merge fixes a semantic conflict with the trivial tree.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:02 +02:00
Tao Xu 8b44d8609f target/i386: Introduce Denverton CPU model
Denverton is the Atom Processor of Intel Harrisonville platform.

For more information:
https://ark.intel.com/content/www/us/en/ark/products/\
codename/63508/denverton.html

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190718073405.28301-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-23 23:37:42 -03:00
Tao Xu 67192a298f x86/cpu: Add support for UMONITOR/UMWAIT/TPAUSE
UMONITOR, UMWAIT and TPAUSE are a set of user wait instructions.
This patch adds support for user wait instructions in KVM. Availability
of the user wait instructions is indicated by the presence of the CPUID
feature flag WAITPKG CPUID.0x07.0x0:ECX[5]. User wait instructions may
be executed at any privilege level, and use IA32_UMWAIT_CONTROL MSR to
set the maximum time.

The patch enable the umonitor, umwait and tpause features in KVM.
Because umwait and tpause can put a (psysical) CPU into a power saving
state, by default we dont't expose it to kvm and enable it only when
guest CPUID has it. And use QEMU command-line "-overcommit cpu-pm=on"
(enable_cpu_pm is enabled), a VM can use UMONITOR, UMWAIT and TPAUSE
instructions. If the instruction causes a delay, the amount of time
delayed is called here the physical delay. The physical delay is first
computed by determining the virtual delay (the time to delay relative to
the VM’s timestamp counter). Otherwise, UMONITOR, UMWAIT and TPAUSE cause
an invalid-opcode exception(#UD).

The release document ref below link:
https://software.intel.com/sites/default/files/\
managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf

Co-developed-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Jingqi Liu <jingqi.liu@intel.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20191011074103.30393-2-tao3.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-23 17:50:27 +02:00
Vitaly Kuznetsov 30d6ff662d i386/kvm: add NoNonArchitecturalCoreSharing Hyper-V enlightenment
Hyper-V TLFS specifies this enlightenment as:
"NoNonArchitecturalCoreSharing - Indicates that a virtual processor will never
share a physical core with another virtual processor, except for virtual
processors that are reported as sibling SMT threads. This can be used as an
optimization to avoid the performance overhead of STIBP".

However, STIBP is not the only implication. It was found that Hyper-V on
KVM doesn't pass MD_CLEAR bit to its guests if it doesn't see
NoNonArchitecturalCoreSharing bit.

KVM reports NoNonArchitecturalCoreSharing in KVM_GET_SUPPORTED_HV_CPUID to
indicate that SMT on the host is impossible (not supported of forcefully
disabled).

Implement NoNonArchitecturalCoreSharing support in QEMU as tristate:
'off' - the feature is disabled (default)
'on' - the feature is enabled. This is only safe if vCPUS are properly
 pinned and correct topology is exposed. As CPU pinning is done outside
 of QEMU the enablement decision will be made on a higher level.
'auto' - copy KVM setting. As during live migration SMT settings on the
source and destination host may differ this requires us to add a migration
blocker.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20191018163908.10246-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-22 09:38:42 +02:00
Xiaoyao Li 69edb0f37a target/i386: Add Snowridge-v2 (no MPX) CPU model
Add new version of Snowridge CPU model that removes MPX feature.

MPX support is being phased out by Intel. GCC has dropped it, Linux kernel
and KVM are also going to do that in the future.

Signed-off-by: Xiaoyao Li <xiaoyao.li@intel.com>
Message-Id: <20191012024748.127135-1-xiaoyao.li@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Bingsong Si 76ecd7a514 i386: Fix legacy guest with xsave panic on host kvm without update cpuid.
without kvm commit 412a3c41, CPUID(EAX=0xd,ECX=0).EBX always equal to 0 even
through guest update xcr0, this will crash legacy guest(e.g., CentOS 6).
Below is the call trace on the guest.

[    0.000000] kernel BUG at mm/bootmem.c:469!
[    0.000000] invalid opcode: 0000 [#1] SMP
[    0.000000] last sysfs file:
[    0.000000] CPU 0
[    0.000000] Modules linked in:
[    0.000000]
[    0.000000] Pid: 0, comm: swapper Tainted: G           --------------- H  2.6.32-279#2 Red Hat KVM
[    0.000000] RIP: 0010:[<ffffffff81c4edc4>]  [<ffffffff81c4edc4>] alloc_bootmem_core+0x7b/0x29e
[    0.000000] RSP: 0018:ffffffff81a01cd8  EFLAGS: 00010046
[    0.000000] RAX: ffffffff81cb1748 RBX: ffffffff81cb1720 RCX: 0000000001000000
[    0.000000] RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffffffff81cb1720
[    0.000000] RBP: ffffffff81a01d38 R08: 0000000000000000 R09: 0000000000001000
[    0.000000] R10: 02008921da802087 R11: 00000000ffff8800 R12: 0000000000000000
[    0.000000] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000001000000
[    0.000000] FS:  0000000000000000(0000) GS:ffff880002200000(0000) knlGS:0000000000000000
[    0.000000] CS:  0010 DS: 0018 ES: 0018 CR0: 0000000080050033
[    0.000000] CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000001406b0
[    0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.000000] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    0.000000] Process swapper (pid: 0, threadinfo ffffffff81a00000, task ffffffff81a8d020)
[    0.000000] Stack:
[    0.000000]  0000000000000002 81a01dd881eaf060 000000007e5fe227 0000000000001001
[    0.000000] <d> 0000000000000040 0000000000000001 0000006cffffffff 0000000001000000
[    0.000000] <d> ffffffff81cb1720 0000000000000000 0000000000000000 0000000000000000
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff81c4f074>] ___alloc_bootmem_nopanic+0x8d/0xca
[    0.000000]  [<ffffffff81c4f0cf>] ___alloc_bootmem+0x11/0x39
[    0.000000]  [<ffffffff81c4f172>] __alloc_bootmem+0xb/0xd
[    0.000000]  [<ffffffff814d42d9>] xsave_cntxt_init+0x249/0x2c0
[    0.000000]  [<ffffffff814e0689>] init_thread_xstate+0x17/0x25
[    0.000000]  [<ffffffff814e0710>] fpu_init+0x79/0xaa
[    0.000000]  [<ffffffff814e27e3>] cpu_init+0x301/0x344
[    0.000000]  [<ffffffff81276395>] ? sort+0x155/0x230
[    0.000000]  [<ffffffff81c30cf2>] trap_init+0x24e/0x25f
[    0.000000]  [<ffffffff81c2bd73>] start_kernel+0x21c/0x430
[    0.000000]  [<ffffffff81c2b33a>] x86_64_start_reservations+0x125/0x129
[    0.000000]  [<ffffffff81c2b438>] x86_64_start_kernel+0xfa/0x109
[    0.000000] Code: 03 48 89 f1 49 c1 e8 0c 48 0f af d0 48 c7 c6 00 a6 61 81 48 c7 c7 00 e5 79 81 31 c0 4c 89 74 24 08 e8 f2 d7 89 ff 4d 85 e4 75 04 <0f> 0b eb fe 48 8b 45 c0 48 83 e8 01 48 85 45
c0 74 04 0f 0b eb

Signed-off-by: Bingsong Si <owen.si@ucloud.cn>
Message-Id: <20190822042901.16858-1-owen.si@ucloud.cn>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Tao Xu e7694a5eae target/i386: drop the duplicated definition of cpuid AVX512_VBMI macro
Drop the duplicated definition of cpuid AVX512_VBMI macro and rename
it as CPUID_7_0_ECX_AVX512_VBMI. Rename CPUID_7_0_ECX_VBMI2 as
CPUID_7_0_ECX_AVX512_VBMI2.

Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190926021055.6970-3-tao3.xu@intel.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:34:44 -03:00
Paolo Bonzini 20a78b02d3 target/i386: add VMX features
Add code to convert the VMX feature words back into MSR values,
allowing the user to enable/disable VMX features as they wish.  The same
infrastructure enables support for limiting VMX features in named
CPU models.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:20 +02:00
Paolo Bonzini ede146c2e7 target/i386: expand feature words to 64 bits
VMX requires 64-bit feature words for the IA32_VMX_EPT_VPID_CAP
and IA32_VMX_BASIC MSRs.  (The VMX control MSRs are 64-bit wide but
actually have only 32 bits of information).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Paolo Bonzini 99e24dbdaa target/i386: introduce generic feature dependency mechanism
Sometimes a CPU feature does not make sense unless another is
present.  In the case of VMX features, KVM does not even allow
setting the VMX controls to some invalid combinations.

Therefore, this patch adds a generic mechanism that looks for bits
that the user explicitly cleared, and uses them to remove other bits
from the expanded CPU definition.  If these dependent bits were also
explicitly *set* by the user, this will be a warning for "-cpu check"
and an error for "-cpu enforce".  If not, then the dependent bits are
cleared silently, for convenience.

With VMX features, this will be used so that for example
"-cpu host,-rdrand" will also hide support for RDRAND exiting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Paolo Bonzini 245edd0cfb target/i386: handle filtered_features in a new function mark_unavailable_features
The next patch will add a different reason for filtering features, unrelated
to host feature support.  Extract a new function that takes care of disabling
the features and optionally reporting them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:19 +02:00
Dmitry Poletaev 56f997500a Fix wrong behavior of cpu_memory_rw_debug() function in SMM
There is a problem, that you don't have access to the data using cpu_memory_rw_debug() function when in SMM. You can't remotely debug SMM mode program because of that for example.
Likely attrs version of get_phys_page_debug should be used to get correct asidx at the end to handle access properly.
Here the patch to fix it.

Signed-off-by: Dmitry Poletaev <poletaev@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:18 +02:00
Sebastian Andrzej Siewior e900135dcf i386: Add CPUID bit for CLZERO and XSAVEERPTR
The CPUID bits CLZERO and XSAVEERPTR are availble on AMD's ZEN platform
and could be passed to the guest.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:17 +02:00
Jing Liu 80db491da4 x86: Intel AVX512_BF16 feature enabling
Intel CooperLake cpu adds AVX512_BF16 instruction, defining as
CPUID.(EAX=7,ECX=1):EAX[bit 05].

The patch adds a property for setting the subleaf of CPUID leaf 7 in
case that people would like to specify it.

The release spec link as follows,
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf

Signed-off-by: Jing Liu <jing2.liu@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 20:00:52 +02:00
Wanpeng Li b896c4b50d target-i386: adds PV_SCHED_YIELD CPUID feature bit
Adds PV_SCHED_YIELD CPUID feature bit.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1562745771-8414-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:18 +02:00
Marcelo Tosatti d645e13287 kvm: i386: halt poll control MSR support
Add support for halt poll control MSR: save/restore, migration
and new feature name.

The purpose of this MSR is to allow the guest to disable
host halt poll.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20190603230408.GA7938@amt.cnet>
[Do not enable by default, as pointed out by Mark Kanda. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-20 17:26:17 +02:00
Markus Armbruster 650d103d3e Include hw/hw.h exactly where needed
In my "build everything" tree, changing hw/hw.h triggers a recompile
of some 2600 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).

The previous commits have left only the declaration of hw_error() in
hw/hw.h.  This permits dropping most of its inclusions.  Touching it
now recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-08-16 13:31:52 +02:00
Markus Armbruster 71e8a91585 Include sysemu/reset.h a lot less
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).

The main culprit is hw/hw.h, which supposedly includes it for
convenience.

Include sysemu/reset.h only where it's needed.  Touching it now
recompiles less than 200 objects.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
2019-08-16 13:31:52 +02:00
Paul Lai ff656fcd33 i386: Fix Snowridge CPU model name and features
Changing the name to Snowridge from SnowRidge-Server.
There is no client model of Snowridge, so "-Server" is unnecessary.

Removing CPUID_EXT_VMX from Snowridge cpu feature list.

Signed-off-by: Paul Lai <paul.c.lai@intel.com>
Tested-by: Tao3 Xu <tao3.xu@intel.com>
Message-Id: <20190716155808.25010-1-paul.c.lai@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-29 13:08:02 -03:00
Denis V. Lunev 2924ab02c2 i386: indicate that 'pconfig' feature was removed intentionally
pconfig feature was added in 5131dc433d and removed in 712f807e19.
This patch mark this feature as known to QEMU and removed by
intentinally. This follows the convention of 9ccb9784b5 and f1a23522b0
dealing with 'osxsave' and 'ospke'.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190719111222.14943-1-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-07-19 23:45:28 +02:00
Eduardo Habkost fd63c6d1a5 i386: Add Cascadelake-Server-v2 CPU model
Add new version of Cascadelake-Server CPU model, setting
stepping=5 and enabling the IA32_ARCH_CAPABILITIES MSR
with some flags.

The new feature will introduce a new host software requirement,
breaking our CPU model runnability promises.  This means we can't
enable the new CPU model version by default in QEMU 4.1, because
management software isn't ready yet to resolve CPU model aliases.
This is why "pc-*-4.1" will keep returning Cascadelake-Server-v1
if "-cpu Cascadelake-Server" is specified.

Includes a test case to ensure the right combinations of
machine-type + CPU model + command-line feature flags will work
as expected.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-10-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20190703221723.8161-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:12:30 -03:00
Eduardo Habkost 0788a56bd1 i386: Make unversioned CPU models be aliases
This will make unversioned CPU models behavior depend on the
machine type:

* "pc-*-4.0" and older will not report them as aliases.
  This is done to keep compatibility with older QEMU versions
  after management software starts translating aliases.

* "pc-*-4.1" will translate unversioned CPU models to -v1.
  This is done to keep compatibility with existing management
  software, that still relies on CPU model runnability promises.

* "none" will translate unversioned CPU models to their latest
  version.  This is planned become the default in future machine
  types (probably in pc-*-4.3).

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-8-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost 53db89d93b i386: Replace -noTSX, -IBRS, -IBPB CPU models with aliases
The old CPU models will be just aliases for specific versions of
the original CPU models.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-7-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost d86a708815 i386: Define -IBRS, -noTSX, -IBRS versions of CPU models
Add versions of CPU models that are equivalent to their -IBRS,
-noTSX and -IBRS variants.

The separate variants will eventually be removed and become
aliases for these CPU versions.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-6-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost dcafd1ef0a i386: Register versioned CPU models
Add support for registration of multiple versions of CPU models.

The existing CPU models will be registered with a "-v1" suffix.

The -noTSX, -IBRS, and -IBPB CPU model variants will become
versions of the original models in a separate patch, so
make sure we register no versions for them.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-5-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost 164e779ce1 i386: Get model-id from CPU object on "-cpu help"
When introducing versioned CPU models, the string at
X86CPUDefinition::model_id might not be the model-id we'll really
use.  Instantiate a CPU object and check the model-id property on
"-cpu help"

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-4-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost dac1deae65 i386: Add x-force-features option for testing
Add a new option that can be used to disable feature flag
filtering.  This will allow CPU model compatibility test cases to
work without host hardware dependencies.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190628002844.24894-3-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Paul Lai 0b18874bd2 i386: Introduce SnowRidge CPU model
SnowRidge CPU supports Accelerator Infrastrcture Architecture (MOVDIRI,
MOVDIR64B), CLDEMOTE and SPLIT_LOCK_DISABLE.

MOVDIRI, MOVDIR64B, and CLDEMOTE are found via CPUID.
The availability of SPLIT_LOCK_DISABLE is check via msr access

References can be found in either:
 https://software.intel.com/en-us/articles/intel-sdm
 https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-and-future-features-programming-reference

Signed-off-by: Paul Lai <paul.c.lai@intel.com>
Tested-by: Tao3 Xu <tao3.xu@intel.com>
Message-Id: <20190626162129.25345-1-paul.c.lai@intel.com>
[ehabkost: squashed SPLIT_LOCK_DETECT patch]
Message-Id: <20190626163232.25711-1-paul.c.lai@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Like Xu a94e142899 target/i386: Add CPUID.1F generation support for multi-dies PCMachine
The CPUID.1F as Intel V2 Extended Topology Enumeration Leaf would be
exposed if guests want to emulate multiple software-visible die within
each package. Per Intel's SDM, the 0x1f is a superset of 0xb, thus they
can be generated by almost same code as 0xb except die_offset setting.

If the number of dies per package is greater than 1, the cpuid_min_level
would be adjusted to 0x1f regardless of whether the host supports CPUID.1F.
Likewise, the CPUID.1F wouldn't be exposed if env->nr_dies < 2.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190620054525.37188-2-like.xu@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Eduardo Habkost 1c809535e3 i386: Remove unused host_cpudef variable
The variable is completely unused, probably a leftover from
previous code clean up.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190625050008.12789-3-ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:04 -03:00
Roman Kagan 915aee93e7 i386: make 'hv-spinlocks' a regular uint32 property
X86CPU.hv-spinlocks is a uint32 property that has a special setter
validating the value to be no less than 0xFFF and no bigger than
UINT_MAX.  The latter check is redundant; as for the former, there
appears to be no reason to prohibit the user from setting it to a lower
value.

So nuke the dedicated getter/setter pair and convert 'hv-spinlocks' to a
regular uint32 property.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20190618110659.14744-1-rkagan@virtuozzo.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Eduardo Habkost fea306520e i386: Don't print warning if phys-bits was set automatically
If cpu->host_phys_bits_limit is set, QEMU will make
cpu->phys_bits be lower than host_phys_bits on some cases.  This
triggers a warning that was supposed to be printed only if
phys-bits was explicitly set in the command-line.

Reorder the code so the value of cpu->phys_bits is validated
before the cpu->host_phys_bits handling.  This will avoid
unexpected warnings when cpu->host_phys_bits_limit is set.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20190611205420.20286-1-ehabkost@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu d65af288a8 i386: Update new x86_apicid parsing rules with die_offset support
In new sockets/dies/cores/threads model, the apicid of logical cpu could
imply die level info of guest cpu topology thus x86_apicid_from_cpu_idx()
need to be refactored with #dies value, so does apicid_*_offset().

To keep semantic compatibility, the legacy pkg_offset which helps to
generate CPUIDs such as 0x3 for L3 cache should be mapping to die_offset.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-5-like.xu@linux.intel.com>
[ehabkost: squash unit test patch]
Message-Id: <20190612084104.34984-6-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu 176d2cda0d i386/cpu: Consolidate die-id validity in smp context
The field die_id (default as 0) and has_die_id are introduced to X86CPU.
Following the legacy smp check rules, the die_id validity is added to
the same contexts as leagcy smp variables such as hmp_hotpluggable_cpus(),
machine_set_cpu_numa_node(), cpu_slot_to_string() and pc_cpu_pre_plug().

Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-4-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu c26ae61081 i386: Add die-level cpu topology to x86CPU on PCMachine
The die-level as the first PC-specific cpu topology is added to the leagcy
cpu topology model, which has one die per package implicitly and only the
numbers of sockets/cores/threads are configurable.

In the new model with die-level support, the total number of logical
processors (including offline) on board will be calculated as:

     #cpus = #sockets * #dies * #cores * #threads

and considering compatibility, the default value for #dies would be
initialized to one in x86_cpu_initfn() and pc_machine_initfn().

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190612084104.34984-2-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Like Xu 0e11fc6955 hw/i386: Replace global smp variables with machine smp properties
The global smp variables in i386 are replaced with smp machine properties.
To avoid calling qdev_get_machine() as much as possible, some related funtions
for acpi data generations are refactored. No semantic changes.

A local variable of the same name would be introduced in the declaration
phase if it's used widely in the context OR replace it on the spot if it's
only used once. No semantic changes.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Message-Id: <20190518205428.90532-8-like.xu@linux.intel.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-07-05 17:08:03 -03:00
Markus Armbruster 7f7b4e7abe qapi: Split machine-target.json off target.json and misc.json
Move commands query-cpu-definitions, query-cpu-model-baseline,
query-cpu-model-comparison, and query-cpu-model-expansion with their
types from target.json to machine-target.json.  Also move types
CpuModelInfo, CpuModelExpansionType, and CpuModelCompareResult from
misc.json there.  Add machine-target.json to MAINTAINERS section
"Machine core".

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-13-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
[Commit message typo fixed]
2019-07-02 13:37:00 +02:00
Markus Armbruster 8ac25c8442 qapi: Split machine.json off misc.json
Move commands cpu-add, query-cpus, query-cpus-fast,
query-current-machine, query-hotpluggable-cpus, query-machines,
query-memdev, and set-numa-node with their types from misc.json to new
machine.json.  Also move types X86CPURegister32 and
X86CPUFeatureWordInfo.  Add machine.json to MAINTAINERS section
"Machine core".

Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20190619201050.19040-9-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2019-07-02 13:37:00 +02:00
Liran Alon fd13f23b8c target/i386: kvm: Add support for KVM_CAP_EXCEPTION_PAYLOAD
Kernel commit c4f55198c7c2 ("kvm: x86: Introduce KVM_CAP_EXCEPTION_PAYLOAD")
introduced a new KVM capability which allows userspace to correctly
distinguish between pending and injected exceptions.

This distinguish is important in case of nested virtualization scenarios
because a L2 pending exception can still be intercepted by the L1 hypervisor
while a L2 injected exception cannot.

Furthermore, when an exception is attempted to be injected by QEMU,
QEMU should specify the exception payload (CR2 in case of #PF or
DR6 in case of #DB) instead of having the payload already delivered in
the respective vCPU register. Because in case exception is injected to
L2 guest and is intercepted by L1 hypervisor, then payload needs to be
reported to L1 intercept (VMExit handler) while still preserving
respective vCPU register unchanged.

This commit adds support for QEMU to properly utilise this new KVM
capability (KVM_CAP_EXCEPTION_PAYLOAD).

Reviewed-by: Nikita Leshenko <nikita.leshchenko@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-10-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 13:25:27 +02:00
Liran Alon 18ab37ba1c target/i386: kvm: Block migration for vCPUs exposed with nested virtualization
Commit d98f26073b ("target/i386: kvm: add VMX migration blocker")
added a migration blocker for vCPU exposed with Intel VMX.
However, migration should also be blocked for vCPU exposed with
AMD SVM.

Both cases should be blocked because QEMU should extract additional
vCPU state from KVM that should be migrated as part of vCPU VMState.
E.g. Whether vCPU is running in guest-mode or host-mode.

Fixes: d98f26073b ("target/i386: kvm: add VMX migration blocker")
Reviewed-by: Maran Wilson <maran.wilson@oracle.com>
Signed-off-by: Liran Alon <liran.alon@oracle.com>
Message-Id: <20190619162140.133674-6-liran.alon@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 13:23:44 +02:00
Xiaoyao Li 597360c0d8 target/i386: define a new MSR based feature word - FEAT_CORE_CAPABILITY
MSR IA32_CORE_CAPABILITY is a feature-enumerating MSR, which only
enumerates the feature split lock detection (via bit 5) by now.

The existence of MSR IA32_CORE_CAPABILITY is enumerated by CPUID.7_0:EDX[30].

The latest kernel patches about them can be found here:
https://lkml.org/lkml/2019/4/24/1909

Signed-off-by: Xiaoyao Li <xiaoyao.li@linux.intel.com>
Message-Id: <20190617153654.916-1-xiaoyao.li@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 02:29:39 +02:00
Vitaly Kuznetsov 128531d9e1 i386/kvm: add support for Direct Mode for Hyper-V synthetic timers
Hyper-V on KVM can only use Synthetic timers with Direct Mode (opting for
an interrupt instead of VMBus message). This new capability is only
announced in KVM_GET_SUPPORTED_HV_CPUID.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-10-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 02:29:39 +02:00
Vitaly Kuznetsov e48ddcc6ce i386/kvm: implement 'hv-passthrough' mode
In many case we just want to give Windows guests all currently supported
Hyper-V enlightenments and that's where this new mode may come handy. We
pass through what was returned by KVM_GET_SUPPORTED_HV_CPUID.

hv_cpuid_check_and_set() is modified to also set cpu->hyperv_* flags as
we may want to check them later (and we actually do for hv_runtime,
hv_synic,...).

'hv-passthrough' is a development only feature, a migration blocker is
added to prevent issues while migrating between hosts with different
feature sets.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20190517141924.19024-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-21 02:29:38 +02:00