Commit Graph

238 Commits

Author SHA1 Message Date
Daniel P. Berrange 1ce36bfe64 i386: expose "TCGTCGTCGTCG" in the 0x40000000 CPUID leaf
Currently when running KVM, we expose "KVMKVMKVM\0\0\0" in
the 0x40000000 CPUID leaf. Other hypervisors (VMWare,
HyperV, Xen, BHyve) all do the same thing, which leaves
TCG as the odd one out.

The CPUID signature is used by software to detect which
virtual environment they are running in and (potentially)
change behaviour in certain ways. For example, systemd
supports a ConditionVirtualization= setting in unit files.
The virt-what command can also report the virt type it is
running on

Currently both these apps have to resort to custom hacks
like looking for 'fw-cfg' entry in the /proc/device-tree
file to identify TCG.

This change thus proposes a signature "TCGTCGTCGTCG" to be
reported when running under TCG.

To hide this, the -cpu option tcg-cpuid=off can be used.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170509132736.10071-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-07-17 15:41:30 -03:00
Thomas Huth 2099935dbf Move CONFIG_KVM related definitions to kvm_i386.h
pc.h and sysemu/kvm.h are also included from common code (where
CONFIG_KVM is not available), so the #defines that depend on CONFIG_KVM
should not be declared here to avoid that anybody is using them in a
wrong way. Since we're also going to poison CONFIG_KVM for common code,
let's move them to kvm_i386.h instead. Most of the dummy definitions
from sysemu/kvm.h are also unused since the code that uses them is
only compiled for CONFIG_KVM (e.g. target/i386/kvm.c), so the unused
defines are also simply dropped here instead of being moved.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1498454578-18709-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-04 14:30:03 +02:00
Laszlo Ersek 2f295167e0 q35/mch: implement extended TSEG sizes
The q35 machine type currently lets the guest firmware select a 1MB, 2MB
or 8MB TSEG (basically, SMRAM) size. In edk2/OVMF, we use 8MB, but even
that is not enough when a lot of VCPUs (more than approx. 224) are
configured -- SMRAM footprint scales largely proportionally with VCPU
count.

Introduce a new property for "mch" called "extended-tseg-mbytes", which
expresses (in megabytes) the user's choice of TSEG (SMRAM) size.

Invent a new, QEMU-specific register in the config space of the DRAM
Controller, at offset 0x50, in order to allow guest firmware to query the
TSEG (SMRAM) size.

According to Intel Document Number 316966-002, Table 5-1 "DRAM Controller
Register Address Map (D0:F0)":

    Warning: Address locations that are not listed are considered Intel
             Reserved registers locations. Reads to Reserved registers may
             return non-zero values. Writes to reserved locations may
             cause system failures.

             All registers that are defined in the PCI 2.3 specification,
             but are not necessary or implemented in this component are
             simply not included in this document. The
             reserved/unimplemented space in the PCI configuration header
             space is not documented as such in this summary.

Offsets 0x50 and 0x51 are not listed in Table 5-1. They are also not part
of the standard PCI config space header. And they precede the capability
list as well, which starts at 0xe0 for this device.

When the guest writes value 0xffff to this register, the value that can be
read back is that of "mch.extended-tseg-mbytes" -- unless it remains
0xffff. The guest is required to write 0xffff first (as opposed to a
read-only register) because PCI config space is generally not cleared on
QEMU reset, and after S3 resume or reboot, new guest firmware running on
old QEMU could read a guest OS-injected value from this register.

After reading the available "extended" TSEG size, the guest firmware may
actually request that TSEG size by writing pattern 11b to the ESMRAMC
register's TSEG_SZ bit-field. (The Intel spec referenced above defines
only patterns 00b (1MB), 01b (2MB) and 10b (8MB); 11b is reserved.)

On the QEMU command line, the value can be set with

  -global mch.extended-tseg-mbytes=N

The default value for 2.10+ q35 machine types is 16. The value is limited
to 0xfff (4095) at the moment, purely so that the product (4095 MB) can be
stored to the uint32_t variable "tseg_size" in mch_update_smram(). Users
are responsible for choosing sensible TSEG sizes.

On 2.9 and earlier q35 machine types, the default value is 0. This lets
the 11b bit pattern in ESMRAMC.TSEG_SZ, and the register at offset 0x50,
keep their original behavior.

When "extended-tseg-mbytes" is nonzero, the new register at offset 0x50 is
set to that value on reset, for completeness.

PCI config space is migrated automatically, so no VMSD changes are
necessary.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1447027
Ref: https://lists.01.org/pipermail/edk2-devel/2017-May/010456.html
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-06-16 18:07:08 +03:00
Eduardo Habkost 1f43571604 pc: Use "min-[x]level" on compat_props
Since the automatic cpuid-level code was introduced in commit
c39c0edf9b ("target-i386: Automatically
set level/xlevel/xlevel2 when needed"), the CPU model tables just define
the default CPUID level code (set using "min-level").  Setting
"[x]level" forces CPUID level to a specific value and disable the
automatic-level logic.

But the PC compat code was not updated and the existing "[x]level"
compat properties broke compatibility for people using features that
triggered the auto-level code.  To keep previous behavior, we should set
"min-[x]level" instead of "[x]level" on compat_props.

This was not a problem for most cases, because old machine-types don't
have full-cpuid-auto-level enabled.  The only common use case it broke
was the CPUID[7] auto-level code, that was already enabled since the
first CPUID[7] feature was introduced (in QEMU 1.4.0).

This causes the regression reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1454641

Change the PC compat code to use "min-[x]level" instead of "[x]level" on
compat_props, and add new test cases to ensure we don't break this
again.

Reported-by: "Guo, Zhiyi" <zhguo@redhat.com>
Fixes: c39c0edf9b ("target-i386: Automatically set level/xlevel/xlevel2 when needed")
Cc: qemu-stable@nongnu.org
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-06-05 14:59:08 -03:00
Peter Xu 465238d9f8 pc: add 2.10 machine type
CC: "Michael S. Tsirkin" <mst@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Igor Mammedov 98e753a6e5 pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot
Since 2.7 commit (b2a575a Add optionrom compatible with fw_cfg DMA version)
regressed migration during firmware exection time by
abusing fwcfg.dma_enabled property to decide loading
dma version of option rom AND by mistake disabling DMA
for 2.6 and earlier globally instead of only for option rom.

so 2.6 machine type guest is broken when it already runs
firmware in DMA mode but migrated to qemu-2.7(pc-2.6)
at that time;

a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport)
b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=ioport,oprom=ioport)

  to:   a     b
from
a       OK   FAIL
b       OK   OK

So we currently have broken forward migration from
qemu-2.6 to qemu-2.[789] that however could be fixed
for 2.10 by re-enabling DMA for 2.[56] machine types
and allowing dma capable option rom only since 2.7.
As result qemu should end up with:

c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=ioport)

   to:  a     b    c
from
a      OK   FAIL  OK
b      OK   OK    OK
c      OK   FAIL  OK

where forward migration from qemu-2.6 to qemu-2.10 should
work again leaving only qemu-2.[789]:pc-2.6 broken.

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-05-10 22:04:23 +03:00
Phil Dennis-Jordan 6103451aeb hw/i386: Build-time assertion on pc/q35 reset register being identical.
This adds a clarifying comment and build time assert to the FADT reset register field initialisation: the reset register is the same on both machine types.

Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1489558827-28971-3-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-05-03 12:29:40 +02:00
Eduardo Habkost ec56a4a7b0 i386: Change stepping of Haswell to non-blacklisted value
glibc blacklists TSX on Haswell CPUs with model==60 and
stepping < 4. To make the Haswell CPU model more useful, make
those guests actually use TSX by changing CPU stepping to 4.

References:
* glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359
  https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309181212.18864-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-10 15:01:09 -03:00
Dr. David Alan Gilbert fc3a1fd74f x86: Work around SMI migration breakages
Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU
due to a disagreement about SM (System management) interrupts.

2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI
and this gets into the migration stream, but on 2.3.0 it
never got delivered.

~2.4.0 SMI interrupt support was added but was broken - so
that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI
but never actually caused an interrupt.

The SMI delivery was recently fixed by 68c6efe07a, but the
effect now is that an incoming 2.3.0 stream takes the interrupt it
had flagged but it's bios can't actually handle it(I think
partly due to the original interrupt not being taken during boot?).
The consequence is a triple(?) fault and a reboot.

Tested from:
  2.3.1 -M 2.3.0
  2.7.0 -M 2.3.0
  2.8.0 -M 2.3.0
  2.8.0 -M 2.8.0

This corresponds to RH bugzilla entry 1420679.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170223133441.16010-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Igor Mammedov 38690a1ca7 machine: move possible_cpus to MachineState
so that it would be possible to reuse it with
spapr/virt-aarch64 targets.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-02-22 11:28:28 +11:00
Marc-André Lureau 0ec7b3e7f2 char: rename CharDriverState Chardev
Pick a uniform chardev type name.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:59 +01:00
Phil Dennis-Jordan 0b564e6f53 pc: Enable vmware-cpuid-freq CPU option for 2.9+ machine types
Signed-off-by: Phil Dennis-Jordan <phil@philjordan.eu>
Message-Id: <1484921496-11257-4-git-send-email-phil@philjordan.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:58 +01:00
Laszlo Ersek b8bab8eb69 hw/isa/lpc_ich9: negotiate SMI broadcast on pc-q35-2.9+ machine types
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20170126014416.11211-4-lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-27 18:07:31 +01:00
Igor Mammedov 80e5db303d machine: Make possible_cpu_arch_ids() return const pointer
make sure that external callers won't try to modify
possible_cpus and owner of possible_cpus can access
it directly when it modifies it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1484759609-264075-5-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-01-23 21:25:37 -02:00
Marcelo Tosatti abc62c89f3 pc.h: move x-mach-use-reliable-get-clock compat entry to PC_COMPAT_2_8
As noticed by David Gilbert, commit 6053a86 'kvmclock: reduce kvmclock
differences on migration' added 'x-mach-use-reliable-get-clock' and a
compatibility entry that turns it off; however it got merged after 2.8.0
was released but the entry has gone into PC_COMPAT_2_7 where it should
have gone into PC_COMPAT_2_8.

Fix it by moving the entry to PC_COMPAT_2_8.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170118175343.GA26873@amt.cnet>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-01-20 13:22:57 +01:00
Marcelo Tosatti 6053a86fe7 kvmclock: reduce kvmclock difference on migration
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
indicates that KVM_GET_CLOCK returns a value as seen by the guest at
that moment.

For new machine types, use this value rather than reading
from guest memory.

This reduces kvmclock difference on migration from 5s to 0.1s
(when max_downtime == 5s).

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20161121105052.598267440@redhat.com>
[Add comment explaining what is going on. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:56 +01:00
Chao Peng feddd2fd91 pc: make pit configurable
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <1478330391-74060-4-git-send-email-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:25 +01:00
Chao Peng 272f042877 pc: make sata configurable
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <1478330391-74060-3-git-send-email-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:25 +01:00
Chao Peng be232eb076 pc: make smbus configurable
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <1478330391-74060-2-git-send-email-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-12-22 16:00:25 +01:00
Dr. David Alan Gilbert f9f885b78a migration/pcspk: Turn migration of pcspk off for 2.7 and older
To keep backwards migration compatibility allow us to turn pcspk
migration off.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20161128133201.16104-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-11-28 16:45:12 +01:00
Igor Mammedov e3cadac073 pc: fix FW_CFG_NB_CPUS to account for -device added CPUs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479301481-197333-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-16 12:10:00 -02:00
Igor Mammedov eabff15820 Revert "pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs"
This reverts commit 080ac219cc.

Legacy FW_CFG_NB_CPUS will be reused instead of 'etc/boot-cpus'
fw_cfg file since it does the same and there is no point
to maintaing duplicate guest ABI, if it can be helped.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479212236-183810-2-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-11-16 12:09:53 -02:00
Wei Liu 021746c131 PCMachineState: introduce acpi_build_enabled field
Introduce this field to control whether ACPI build is enabled by a
particular machine or accelerator.

It defaults to true if the machine itself supports ACPI build. Xen
accelerator will disable it because Xen is in charge of building ACPI
tables for the guest.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
2016-11-02 12:26:12 -07:00
Anand J 814bb12a56 clean-up: removed duplicate #includes
Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Anand J <anand.indukala@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-28 18:17:24 +03:00
Igor Mammedov 080ac219cc pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
Currently firmware uses 1 byte at 0x5F offset in RTC CMOS
to get number of CPUs present at boot. However 1 byte is
not enough to handle more than 255 CPUs.  So add a new
fw_cfg file that would allow QEMU to tell it.
For compat reasons add file only for machine types that
support more than 255 CPUs.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:15 -02:00
Peter Maydell 86e121ae75 * Thread Sanitizer fixes (Alex)
* Coverity fixes (David)
 * test-qht fixes (Emilio)
 * QOM interface for info irq/info pic (Hervé)
 * -rtc clock=rt fix (Junlian)
 * mux chardev fixes (Marc-André)
 * nicer report on death by signal (Michal)
 * qemu-tech TLC (Paolo)
 * MSI support for edu device (Peter)
 * qemu-nbd --offset fix (Tomáš)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJX98xmFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 IXsH/idLNlBzbrGhcuZOXEAd4fCyCyhXGMuOAGJXLHgv+EfiqrJ9z4HTn44czdh7
 rJuQDYeDrfl36zc0n8weY7JSEsorCq+JBDomFUFodmCrFUIue2jXYOK6pt5LUrQM
 OTyruQMKHD316SnJFOK8Tkxi5DrAHNRs+ynDcm+IoB65KE9YgBcBWuEJ03mF9cHi
 5sb/SBEqfL49gVlnFXBDTRgXXwA5axS7xKd4+7CWtbVFvJxurImjywGqKI5G/dmC
 TJyP+Dty4iNjFP1E0VvfL6ETovncZlfe4Hx1b971pll/ec88jGL0brqQMPjACrWh
 TyLXLN9oTbEKuDxx1Nh23xRFh+c=
 =sgtZ
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Thread Sanitizer fixes (Alex)
* Coverity fixes (David)
* test-qht fixes (Emilio)
* QOM interface for info irq/info pic (Hervé)
* -rtc clock=rt fix (Junlian)
* mux chardev fixes (Marc-André)
* nicer report on death by signal (Michal)
* qemu-tech TLC (Paolo)
* MSI support for edu device (Peter)
* qemu-nbd --offset fix (Tomáš)

# gpg: Signature made Fri 07 Oct 2016 17:25:10 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (39 commits)
  qemu-doc: merge qemu-tech and qemu-doc
  qemu-tech: rewrite some parts
  qemu-tech: reorganize content
  qemu-tech: move TCG test documentation to tests/tcg/README
  qemu-tech: move user mode emulation features from qemu-tech
  qemu-tech: document lazy condition code evaluation in cpu.h
  qemu-tech: move text from qemu-tech to tcg/README
  qemu-doc: drop installation and compilation notes
  qemu-doc: replace introduction with the one from the internals manual
  qemu-tech: drop index
  test-qht: perform lookups under rcu_read_lock
  qht: fix unlock-after-free segfault upon resizing
  qht: simplify qht_reset_size
  qemu-nbd: Shrink image size by specified offset
  qemu_kill_report: Report PID name too
  util: Introduce qemu_get_pid_name
  char: update read handler in all cases
  char: use a fixed idx for child muxed chr
  i8259: give ISA device when registering ISA ioports
  .travis.yml: add gcc sanitizer build
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-10 10:39:29 +01:00
Hervé Poussineau 61b97833b3 intc: make HMP 'info irq' and 'info pic' commands use InterruptStatsProvider interface
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1474921408-24710-6-git-send-email-hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-10-04 10:00:25 +02:00
Evgeny Yakovlev 339892d758 target-i386: Correct family/model/stepping for Opteron_G3
Current CPU definition for AMD Opteron third generation includes
features like SSE4a and LAHF_LM support in emulated CPUID. These
features are present in K8 rev.E or K10 CPUs and later. However,
current G3 family and model describe 2nd generation K8 cores instead.

This is incorrect but was considered harmless until our tests found a
problem with linux kernels >= 3.10 (and maybe earlier) which specifically
check for Opteron K8 model when parsing CPUID leaf 0x80000001:
http://lxr.free-electrons.com/source/arch/x86/kernel/cpu/amd.c?v=3.16#L552
This code will disable LAHF_LM feature in /proc/cpuinfo if model number
is inconsistent.

This change sets Opteron_G3 family/model/stepping to 16/2/3 which is
a proper Opteron 3rd generation 2350 CPU.

Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-03 16:06:43 -03:00
Eduardo Habkost c39c0edf9b target-i386: Automatically set level/xlevel/xlevel2 when needed
Instead of requiring users and management software to be aware of
required CPUID level/xlevel/xlevel2 values for each feature,
automatically increase those values when features need them.

This was already done for CPUID[7].EBX, and is now made generic
for all CPUID feature flags. Unit test included, to make sure we
don't break ABI on older machine-types and don't mess with the
CPUID level values if they are explicitly set by the user.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 16:17:17 -03:00
Igor Mammedov 152fcbecad target-i386: turn off CPU.l3-cache only for 2.7 and older machine types
commit (14c985cff target-i386: present virtual L3 cache info for vcpus)
misplaced compat property putting it in new 2.8 machine type
which would effectively to disable feature until 2.9 is released.
Intent of commit probably should be to disable feature for 2.7
and older while allowing not yet released 2.8 to have feature
enabled by default.

Cc: qemu-stable@nongnu.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-23 18:51:40 +03:00
Igor Mammedov 2e0910329b pc: clean up COMPAT macro chaining
Since commit
 bacc344c ("machine: add properties to compat_props incrementaly")
there is no need to chain per machine type compat macro.

Clean up places where it was done anyway so it will be
consistent and won't confuse contributors during addtion
of new machine types.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-23 18:51:40 +03:00
Ladi Prosek d4b84d564e Remove unused function declarations
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-15 15:32:22 +03:00
Longpeng(Mike) 14c985cffa target-i386: present virtual L3 cache info for vcpus
Some software algorithms are based on the hardware's cache info, for example,
for x86 linux kernel, when cpu1 want to wakeup a task on cpu2, cpu1 will trigger
a resched IPI and told cpu2 to do the wakeup if they don't share low level
cache. Oppositely, cpu1 will access cpu2's runqueue directly if they share llc.
The relevant linux-kernel code as bellow:

	static void ttwu_queue(struct task_struct *p, int cpu)
	{
		struct rq *rq = cpu_rq(cpu);
		......
		if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
			......
			ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
			return;
		}
		......
		ttwu_do_activate(rq, p, 0); /* access target's rq directly */
		......
	}

In real hardware, the cpus on the same socket share L3 cache, so one won't
trigger a resched IPIs when wakeup a task on others. But QEMU doesn't present a
virtual L3 cache info for VM, then the linux guest will trigger lots of RES IPIs
under some workloads even if the virtual cpus belongs to the same virtual socket.

For KVM, there will be lots of vmexit due to guest send IPIs.
The workload is a SAP HANA's testsuite, we run it one round(about 40 minuates)
and observe the (Suse11sp3)Guest's amounts of RES IPIs which triggering during
the period:
        No-L3           With-L3(applied this patch)
cpu0:	363890		44582
cpu1:	373405		43109
cpu2:	340783		43797
cpu3:	333854		43409
cpu4:	327170		40038
cpu5:	325491		39922
cpu6:	319129		42391
cpu7:	306480		41035
cpu8:	161139		32188
cpu9:	164649		31024
cpu10:	149823		30398
cpu11:	149823		32455
cpu12:	164830		35143
cpu13:	172269		35805
cpu14:	179979		33898
cpu15:	194505		32754
avg:	268963.6	40129.8

The VM's topology is "1*socket 8*cores 2*threads".
After present virtual L3 cache info for VM, the amounts of RES IPIs in guest
reduce 85%.

For KVM, vcpus send IPIs will cause vmexit which is expensive, so it can cause
severe performance degradation. We had tested the overall system performance if
vcpus actually run on sparate physical socket. With L3 cache, the performance
improves 7.2%~33.1%(avg:15.7%).

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Longpeng(Mike) a4d3c83476 pc: Add 2.8 machine
This will used by the next patch.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-09-09 20:58:34 +03:00
Marc-André Lureau 3e6c0c4c2c pc: keep gsi reference
Further cleanup would need to call qemu_free_irq() at the appropriate
time, but for now this silences ASAN about direct leaks.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-09-08 18:05:21 +04:00
Marc-André Lureau 8ea753718b machine: use class base init generated name
machine_class_base_init() member name is allocated by
machine_class_base_init(), but not freed by
machine_class_finalize().  Simply freeing there doesn't work,
because DEFINE_PC_MACHINE() overwrites it with a literal string.

Fix DEFINE_PC_MACHINE() not to overwrite it, and add the missing
free to machine_class_finalize().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-09-08 18:05:21 +04:00
Marc-André Lureau d80fe99de4 pc: simplify passing qemu_irq
qemu_irq is already a pointer, no need to have an extra pointer level.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-09-08 18:05:21 +04:00
Igor Mammedov 7298d4fd51 apic: fix broken migration for kvm-apic
commit f6e98444 (apic: Use apic_id as apic's migration instance_id)
breaks migration when in kernel irqchip is used for 2.6 and older
machine types.

It applies compat property only for userspace 'apic' type
instead of applying it to all apic types inherited from
'apic-common' type as it was supposed to do.

Fix it by setting compat property 'legacy-instance-id' for
'apic-common' type which affects inherited types (i.e. not
only 'apic' but also 'kvm-apic' types)

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1469800542-11402-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-08-03 18:44:57 +02:00
Peter Maydell 206d0c2436 pc, pci, virtio: new features, cleanups, fixes
- interrupt remapping for intel iommus
 - a bunch of virtio cleanups
 - fixes all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXkQsqAAoJECgfDbjSjVRpanoIAJ9JVlc1aEjt9sa0cSBcs+NQ
 J7JmgU9FqFsj+4FrNTouO3AxTjHurd1UAULP1WMPD+V3JpbnHct8r6SCBLQ5EBMN
 VOjYo4DwWs1g+DqnQ9WZmbadu06XvYi/yiAKNUzWfZk0MR11D0D/S5hmarNKw0Kq
 tGHeTWjGeY4WqFLV7m+qB4+cqkAByn6um99UtUvgLL05RgIEIP2IEMKYZ+rXvAa9
 iGUvzqlO7mbq/+LbL18kaWywa4TCwbbd2eSGWaqhX4CuB62Rl33mWTXFcfaYhkyp
 Z3FgwaJ09h0lAjSVEbyAuLFMfO/BnMcsoKqwl4xc4vkn/xBCqFtgH9JcEVm3O8U=
 =ge2D
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, pci, virtio: new features, cleanups, fixes

- interrupt remapping for intel iommus
- a bunch of virtio cleanups
- fixes all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 21 Jul 2016 18:49:30 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (57 commits)
  intel_iommu: avoid unnamed fields
  virtio: Update migration docs
  virtio-gpu: Wrap in vmstate
  virtio-gpu: Use migrate_add_blocker for virgl migration blocking
  virtio-input: Wrap in vmstate
  9pfs: Wrap in vmstate
  virtio-serial: Wrap in vmstate
  virtio-net: Wrap in vmstate
  virtio-balloon: Wrap in vmstate
  virtio-rng: Wrap in vmstate
  virtio-blk: Wrap in vmstate
  virtio-scsi: Wrap in vmstate
  virtio: Migration helper function and macro
  virtio-serial: Remove old migration version support
  virtio-net: Remove old migration version support
  virtio-scsi: Replace HandleOutput typedef
  Revert "mirror: Workaround for unexpected iohandler events during completion"
  virtio-scsi: Call virtio_add_queue_aio
  virtio-blk: Call virtio_add_queue_aio
  virtio: Introduce virtio_add_queue_aio
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-21 20:12:37 +01:00
Peter Xu cb135f59b8 q35: ioapic: add support for emulated IOAPIC IR
This patch translates all IOAPIC interrupts into MSI ones. One pseudo
ioapic address space is added to transfer the MSI message. By default,
it will be system memory address space. When IR is enabled, it will be
IOMMU address space.

Currently, only emulated IOAPIC is supported.

Idea suggested by Jan Kiszka and Rita Sinha in the following patch:

https://lists.gnu.org/archive/html/qemu-devel/2016-03/msg01933.html

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-21 20:43:49 +03:00
Igor Mammedov f6e984443f apic: Use apic_id as apic's migration instance_id
instance_id is generated by last_used_id + 1 for a given device type
so for QEMU with 3 CPUs instance_id for APICs is a seti of [0, 1, 2]
When CPU in the middle is hot-removed and migration started
APICs with instance_ids 0 and 2 are transferred in migration stream.
However target starts with 2 CPUs and APICs' instance_ids are
generated from scratch [0, 1] hence migration fails with error
  Unknown savevm section or instance 'apic' 2

Fix issue by manually registering APIC's vmsd with apic_id as
instance_id, in this case instance_id on target will always
match instance_id on source as apic_id is the same for a given
cpu instance.

Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-20 12:02:19 -03:00
Dr. David Alan Gilbert fcc35e7cca target-i386: Fill high bits of mtrr mask
Fill the bits between 51..number-of-physical-address-bits in the
MTRR_PHYSMASKn variable range mtrr masks so that they're consistent
in the migration stream irrespective of the physical address space
of the source VM in a migration.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-07-20 11:58:44 -03:00
Marc Marí b2a575a1c6 Add optionrom compatible with fw_cfg DMA version
This optionrom is based on linuxboot.S.

Signed-off-by: Marc Marí <markmb@redhat.com>
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <1464027093-24073-2-git-send-email-rjones@redhat.com>
[Add -fno-toplevel-reorder, support clang without -m16. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-07-14 15:50:52 +02:00
Peter Maydell 791b7d2340 pc, pci, virtio: new features, cleanups, fixes
iommus can not be added with -device.
 cleanups and fixes all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXe4l4AAoJECgfDbjSjVRpIz4IALye7mKG61/POA4Gqmhalc3d
 HnlNSZ2YcKAuvPg7WWkBuRacrQvVY/MbW1mLloG1lY0tdFgZG8Cy+CY6wJg1NE4c
 cXd+77vHkIyrnl+Nil+QOgTFiAsMnD+mXHHsnCDw2jGn3JbgVNuCMi7V34fGkQd2
 PDkZyYfwTqO3HytuG0/j2Somc9du1gjYdn+9qigfZVgP96jGDojBuJWuuU5flCB3
 Kj5xrOuI01XlbdTk71tVjBJBektQurWr6r7GECDqZIpUfc+BI70FU9jPh+OlLTO/
 92yi29ncjyStz4tRnf18xoQ8uBgH/tD1xigEUPRtnm1+0i/tgONBL8cAdBF9FBE=
 =ABGE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, pci, virtio: new features, cleanups, fixes

iommus can not be added with -device.
cleanups and fixes all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 05 Jul 2016 11:18:32 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (30 commits)
  vmw_pvscsi: remove unnecessary internal msi state flag
  e1000e: remove unnecessary internal msi state flag
  vmxnet3: remove unnecessary internal msi state flag
  mptsas: remove unnecessary internal msi state flag
  megasas: remove unnecessary megasas_use_msi()
  pci: Convert msi_init() to Error and fix callers to check it
  pci bridge dev: change msi property type
  megasas: change msi/msix property type
  mptsas: change msi property type
  intel-hda: change msi property type
  usb xhci: change msi/msix property type
  change pvscsi_init_msi() type to void
  tests: add APIC.cphp and DSDT.cphp blobs
  tests: acpi: add CPU hotplug testcase
  log: Permit -dfilter 0..0xffffffffffffffff
  range: Replace internal representation of Range
  range: Eliminate direct Range member access
  log: Clean up misuse of Range for -dfilter
  pci_register_bar: cleanup
  Revert "virtio-net: unbreak self announcement and guest offloads after migration"
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-05 16:48:24 +01:00
Peter Maydell 60a0f1af07 ipxe: update submodule from 4e03af8ec to 041863191
e1000e+vmxnet3: add boot rom
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJXegFqAAoJEEy22O7T6HE4BasP/A3kDhgn6J3dqJN85eVw9jcb
 52zxttQuDJBibVKZaSIRiZAnBCTEn5WbUvZ5ZRTt5MVH3Whhv09ZhLfdWUJho9az
 Dnebv3wqyHg3SlAWFNBE3CjSkHswwIAhTrERqjJqXL2jU2erPAHIdTzprx1LdGc7
 hKxNy9tPHjyEeGWLUWrTCIC/YcTNkIp1gdSNcVIQ8TbEHIqo0vSK1Jyl429GYVYD
 bZO6VbX5D/QAEHs6a+Twf5lFnLZ4hlplm+9SbfjsfaVocx/+D5VtMD0AgYukR60M
 sHNKqzRimHsLQoz1PV4z9+ZexRDdge4y0eNlVHM8XdXYq0AHybv2/MYW6DHBAXaE
 nayEPPrwNnOVnE95Xgn03CtpddbQ2hAUXObm8gm6aAJEpXh8SeL7YLAYrYJtF5Oq
 cpB0ks0zDZZRPAr/c0gy075IbB3ITGUcphFid9lpn5ork/YYD/y5//GVB22IrOvQ
 qT0IXtmL7OA71i7XOvxYejdTWcArpzgWnOXmh7eVjumxDicC+B3JIORZutKaXHDu
 w9LFnz5l1DyIM0nLEJb2onP4lXRlYdyeiSHjQi4HH6fLEqZERfKlWvftRiE7agGq
 nt6QaFFeSXqaaDOjfmPFLF2RmIgTHzit279WcWGUQXkbWKvQSCgRB26LBGZ6kFB2
 Z1lTOF1b2zUjpKEsuFH2
 =zaj0
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kraxel/tags/pull-ipxe-20160704-1' into staging

ipxe: update submodule from 4e03af8ec to 041863191
e1000e+vmxnet3: add boot rom

# gpg: Signature made Mon 04 Jul 2016 07:25:46 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-ipxe-20160704-1:
  build: add pc-bios to config-host.mak deps
  ipxe: add new roms to BLOBS
  ipxe: update prebuilt binaries
  vmxnet3: add boot rom
  e1000e: add boot rom
  ipxe: add vmxnet3 rom
  ipxe: add e1000e rom
  ipxe: update submodule from 4e03af8ec to 041863191

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-05 12:46:18 +01:00
Markus Armbruster 01c9742d9d pc: Eliminate PcPciInfo
PcPciInfo has two (ill-named) members: Range w32 is the PCI hole, and
w64 is the PCI64 hole.

Three users:

* I440FXState and MCHPCIState have a member PcPciInfo pci_info, but
  only pci_info.w32 is actually used.  This is confusing.  Replace by
  Range pci_hole.

* acpi_build() uses auto PcPciInfo pci_info to forward both PCI holes
  from acpi_get_pci_info() to build_dsdt().  Replace by two variables
  Range pci_hole, pci_hole64.  Rename acpi_get_pci_info() to
  acpi_get_pci_holes().

PcPciInfo is now unused; drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-07-04 14:52:10 +03:00
Efimov Vasily d812b3d68d port92: handle A20 IRQ as GPIO
The port92 device has outgouing IRQ line A20. Currently the IRQ is referenced
by a pointer which normally is set during machine initialization. The
pointer is never changed at runtime. Hence, common GPIO model can be applied
to A20 IRQ line. Note that checking for IRQ to be connected as in
previous version of code is not required qemu_set_irq will do it.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily 3115b9e2d2 pckbd: handle A20 IRQ as GPIO
The i8042 device has outgouing IRQ line A20. Currently the IRQ is referenced
by a pointer which normally is set during machine initialization. The pointer
is never changed at runtime. So common GPIO model can be applied to A20 IRQ
line. Note that checking for IRQ to be connected as in previous version
of code is not required because qemu_set_irq will do it.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily 401f2f3ef1 Q35: implement property interfece to several parameters
During creation of Q35 instance several parameters are set using direct access.
It violates Qemu device model. Correctly, the parameters should be handled as
object properties.

The patch adds four link type properties for fields:
mch.ram_memory
mch.pci_address_space
mch.system_memory
mch.address_space_io
And, it adds two size type properties for fields:
mch.below_4g_mem_size
mch.above_4g_mem_size

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily 936a6447c8 vmport: identify vmport type by macro TYPE_VMPORT
Currently vmport device is identified by the string literal. Using a
preprocessor alias instead is preferable.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:45 +02:00