Commit Graph

1934 Commits

Author SHA1 Message Date
Chenyi Qiang
035d1ef265 i386: Add ratelimit for bus locks acquired in guest
A bus lock is acquired through either split locked access to writeback
(WB) memory or any locked access to non-WB memory. It is typically >1000
cycles slower than an atomic operation within a cache and can also
disrupts performance on other cores.

Virtual Machines can exploit bus locks to degrade the performance of
system. To address this kind of performance DOS attack coming from the
VMs, bus lock VM exit is introduced in KVM and it can report the bus
locks detected in guest. If enabled in KVM, it would exit to the
userspace to let the user enforce throttling policies once bus locks
acquired in VMs.

The availability of bus lock VM exit can be detected through the
KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential
policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in
bitmap is the only supported strategy at present. It indicates that KVM
will exit to userspace to handle the bus locks.

This patch adds a ratelimit on the bus locks acquired in guest as a
mitigation policy.

Introduce a new field "bus_lock_ratelimit" to record the limited speed
of bus locks in the target VM. The user can specify it through the
"bus-lock-ratelimit" as a machine property. In current implementation,
the default value of the speed is 0 per second, which means no
restrictions on the bus locks.

As for ratelimit on detected bus locks, simply set the ratelimit
interval to 1s and restrict the quota of bus lock occurence to the value
of "bus_lock_ratelimit". A potential alternative is to introduce the
time slice as a property which can help the user achieve more precise
control.

The detail of bus lock VM exit can be found in spec:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-06-17 14:11:06 -04:00
Stefan Berger
11fb99e6f4 i386: Eliminate all TPM related code if CONFIG_TPM is not set
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614191335.1968807-2-stefanb@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-06-15 10:54:46 -04:00
Philippe Mathieu-Daudé
585190902a misc: Correct relative include path
Headers should be included from the 'include/' directory,
not from the root directory.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210516205034.694788-1-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-06-05 21:10:42 +02:00
Dmitry Voronetskiy
d84451d38e i386/kvm: The value passed to strerror should be positive
Signed-off-by: Dmitry Voronetskiy <vda1999@yandex.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210519113528.12474-1-davoronetskiy@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-06-05 21:01:17 +02:00
Peter Maydell
8c345b3e6a * Update the references to some doc files (use *.rst instead of *.txt)
* Bump minimum versions of some requirements after removing CentOS 7 support
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmC3L1IRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbU8wA//aRAIn0Qa3xoZdsT434P99GHauCZ5ePq3
 meY4co69c+TfkQZ/b0xlyvT3+7bd9Ni92CQL7n/LtX5bs3pIhRjHNIsm50e4x6Da
 0jb422cKWfRIltXlCfdL/dnFRtJluH83M0seRGMvmhveuWPZ19oSIasEiQjeg//e
 KMNK5z4HEnJ1czb3Bf8p38bmmY/O/QEAA5wAqk7iJkJHz6T/GlqImLYYwpFPlHj6
 JQttm0aWsHrsqEXxnuV0/DT1yHyXDB6S4iuAvABZWhv/M/nCaXo0ib0gW5NPtRPo
 Yf3HO163F9/fewJCc1AUsBe1C/2UwmSWRhEtxpr9uuW2Mv9qEl3hkJwd4k6sEOvh
 U4i+GONC4eElPcmECKUfHA9EP+7faDs6xnM6Ev/PIEp+cPJ2QRfklZv4qpMUWYtb
 3KkADchOyVZAsdB8cGmnznDEVmno1Dt0adVUq8CF6uW6MwD3pb4838arrfwIfwOp
 g4yTI1AQQykkxxOaR7GSNoxWRti7TH4fzLhx/xXDd9TKIOWuOepiyuhB7+Q48WAJ
 6EW/JUIzOe7k3GsI0iBsk+y67ED2tpATiRKWWw4QS1BwIhNEMUqgZkucFfKE6Ze1
 B1Xw+Di+1CzrLHV6lCjtdLVmqWqcGcO43HqhSK4sodE87BDGGwy9UxE6oBVt2C1x
 HPQ6/omDlFk=
 =RAkI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-06-02' into staging

* Update the references to some doc files (use *.rst instead of *.txt)
* Bump minimum versions of some requirements after removing CentOS 7 support

# gpg: Signature made Wed 02 Jun 2021 08:12:18 BST
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/thuth-gitlab/tags/pull-request-2021-06-02:
  configure: bump min required CLang to 6.0 / XCode 10.0
  configure: bump min required GCC to 7.5.0
  configure: bump min required glib version to 2.56
  tests/docker: drop CentOS 7 container
  tests/vm: convert centos VM recipe to CentOS 8
  crypto: drop used conditional check
  crypto: bump min gnutls to 3.5.18, dropping RHEL-7 support
  crypto: bump min gcrypt to 1.8.0, dropping RHEL-7 support
  crypto: drop back compatibility typedefs for nettle
  crypto: bump min nettle to 3.4, dropping RHEL-7 support
  patchew: move quick build job from CentOS 7 to CentOS 8 container
  block/ssh: Bump minimum libssh version to 0.8.7
  docs: fix references to docs/devel/s390-dasd-ipl.rst
  docs: fix references to docs/specs/tpm.rst
  docs: fix references to docs/devel/build-system.rst
  docs: fix references to docs/devel/atomics.rst
  docs: fix references to docs/devel/tracing.rst

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-02 17:08:11 +01:00
Stefano Garzarella
d0fb9657a3 docs: fix references to docs/devel/tracing.rst
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.

We still have several references to the old file, so let's fix them
with the following command:

  sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-02 06:51:09 +02:00
Daniel P. Berrangé
b7c290177c i386: use better matching family/model/stepping for 'qemu64' CPU
The 'qemu64' CPUID currently reports a family/model/stepping that
approximately corresponds to an AMD K7 vintage architecture.
The K7 series predates the introduction of 64-bit support by AMD
in the K8 series. This has been reported to lead to LLVM complaints
about generating 64-bit code for a 32-bit CPU target

  LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!

It appears LLVM looks at the family/model/stepping, despite qemu64
reporting it is 64-bit capable.

This patch changes 'qemu64' to report a CPUID with the family, model
and stepping taken from a

 AMD Athlon(tm) 64 X2 Dual Core Processor 4000+

which is one of the first 64-bit AMD CPUs.

Closes https://gitlab.com/qemu-project/qemu/-/issues/191

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210507133650.645526-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-05-31 15:53:03 -04:00
Philippe Mathieu-Daudé
cfa1f4bcee hw/mem/nvdimm: Use Kconfig 'imply' instead of 'depends on'
Per the kconfig.rst:

  A device should be listed [...] ``imply`` if (depending on
  the QEMU command line) the board may or  may not be started
  without it.

This is the case with the NVDIMM device, so use the 'imply'
weak reverse dependency to select the symbol.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210511155354.3069141-2-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:45 +02:00
Peter Maydell
6005ee07c3 pc,pci,virtio: bugfixes, improvements
Fixes all over the place. Faster boot for virtio. ioeventfd support for
 mmio.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmCeiMEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpqsIH/A49Av5Bv8huL75lf9GzCx3E1a/z2W9Fphik
 OcQ1ahR+7CRDARub+vTG40MBmZBVefIWjLAj3BwBWzFGPX0DZq0zeI102VzlEVKY
 OeUx8ixuiKOSLcS+QxE7ZXIBL2Pn7l+MFUi4nLMYKti7c/kola7zlB57qsmXh+VD
 AOQ7Utj6NWoi6QocWJsMSCyHCh3Fk9QzcStLlr6/MkSJa1zqv8l22+8oWH07Fk2M
 wZfhrm9k094on28iSejsFYL5e4ROeXUajbOdfyMIxWvAB7boC9Jxk/e0oAbuSB4y
 2f71Gfk3mU6irS7PvrxcKbk6BVD2zxM2WumOchZJgxFAujDO6yg=
 =fvkT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio: bugfixes, improvements

Fixes all over the place. Faster boot for virtio. ioeventfd support for
mmio.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 14 May 2021 15:27:13 BST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  Fix build with 64 bits time_t
  vhost-vdpa: Make vhost_vdpa_get_device_id() static
  hw/virtio: enable ioeventfd configuring for mmio
  hw/smbios: support for type 41 (onboard devices extended information)
  checkpatch: Fix use of uninitialized value
  virtio-scsi: Configure all host notifiers in a single MR transaction
  virtio-scsi: Set host notifiers and callbacks separately
  virtio-blk: Configure all host notifiers in a single MR transaction
  virtio-blk: Fix rollback path in virtio_blk_data_plane_start()
  pc-dimm: remove unnecessary get_vmstate_memory_region() method
  amd_iommu: fix wrong MMIO operations
  virtio-net: Constify VirtIOFeature feature_sizes[]
  virtio-blk: Constify VirtIOFeature feature_sizes[]
  hw/virtio: Pass virtio_feature_get_config_size() a const argument
  x86: acpi: use offset instead of pointer when using build_header()
  amd_iommu: Fix pte_override_page_mask()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/arm/virt.c
2021-05-16 17:22:46 +01:00
Vincent Bernat
05dfb447a4 hw/smbios: support for type 41 (onboard devices extended information)
Type 41 defines the attributes of devices that are onboard. The
original intent was to imply the BIOS had some level of control over
the enablement of the associated devices.

If network devices are present in this table, by default, udev will
name the corresponding interfaces enoX, X being the instance number.
Without such information, udev will fallback to using the PCI ID and
this usually gives ens3 or ens4. This can be a bit annoying as the
name of the network card may depend on the order of options and may
change if a new PCI device is added earlier on the commande line.
Being able to provide SMBIOS type 41 entry ensure the name of the
interface won't change and helps the user guess the right name without
booting a first time.

This can be invoked with:

    $QEMU -netdev user,id=internet
          -device virtio-net-pci,mac=50:54:00:00:00:42,netdev=internet,id=internet-dev \
          -smbios type=41,designation='Onboard LAN',instance=1,kind=ethernet,pcidev=internet-dev

The PCI segment is assumed to be 0. This should hold true for most
cases.

    $ dmidecode -t 41
    # dmidecode 3.3
    Getting SMBIOS data from sysfs.
    SMBIOS 2.8 present.

    Handle 0x2900, DMI type 41, 11 bytes
    Onboard Device
            Reference Designation: Onboard LAN
            Type: Ethernet
            Status: Enabled
            Type Instance: 1
            Bus Address: 0000:00:09.0

    $ ip -brief a
    lo               UNKNOWN        127.0.0.1/8 ::1/128
    eno1             UP             10.0.2.14/24 fec0::5254:ff:fe00:42/64 fe80::5254:ff:fe00:42/64

Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Message-Id: <20210401171138.62970-1-vincent@bernat.ch>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-14 10:26:18 -04:00
Roman Kapl
e526ab61e9 amd_iommu: fix wrong MMIO operations
Address was swapped with value when writing MMIO registers, so the user
saw garbage in lot of cases. The interrupt status was not correctly set.

Signed-off-by: Roman Kapl <rka@sysgo.com>
Message-Id: <20210427110504.10878-1-rka@sysgo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-14 10:26:18 -04:00
David Hildenbrand
8f44304c76 numa: Teach ram block notifiers about resizeable ram blocks
Ram block notifiers are currently not aware of resizes. To properly
handle resizes during migration, we want to teach ram block notifiers about
resizeable ram.

Introduce the basic infrastructure but keep using max_size in the
existing notifiers. Supply the max_size when adding and removing ram
blocks. Also, notify on resizes.

Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: xen-devel@lists.xenproject.org
Cc: haxm-team@intel.com
Cc: Paul Durrant <paul@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Wenchao Wang <wenchao.wang@intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-3-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Peter Maydell
31589644ba * AccelCPUClass and sysemu/user split for i386 (Claudio)
* i386 page walk unification
 * Fix detection of gdbus-codegen
 * Misc refactoring
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCblEEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroObnQgAj10pRDODY9hIiUiYj2sTcEQTly3p
 DC+ZWDaup67z3WV2C/vAS/x31RGIus+7bzji3fgtUcnGOr7sbuOCcs7CPY8mam5Y
 GMPNrsUk2sZ5z9SVTq2vjEa61tjxtMpYXx9pnhgJzJAO4NJzNuX74ZdpA+oV5aTC
 CvZDk8lC7BLU16MfeLcbw44xE4Oy05wWwaoP2pvhdOg47y85t/S9Il1yBCYi3y8C
 pTOBBCYmHGPj/r7i4MhUGrAjIyGQu1w7av8nZXouRegoeVl28RKR8+pl7TfFpt+E
 cp95yE8dPuNFJnCiZ3Kv01eBcSUcyp4gVb5H2Oa/nkkYRLpnONbUzYFtIA==
 =zR5U
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* AccelCPUClass and sysemu/user split for i386 (Claudio)
* i386 page walk unification
* Fix detection of gdbus-codegen
* Misc refactoring

# gpg: Signature made Wed 12 May 2021 09:39:29 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (32 commits)
  coverity-scan: list components, move model to scripts/coverity-scan
  configure: fix detection of gdbus-codegen
  qemu-option: support accept-any QemuOptsList in qemu_opts_absorb_qdict
  main-loop: remove dead code
  target/i386: use mmu_translate for NPT walk
  target/i386: allow customizing the next phase of the translation
  target/i386: extend pg_mode to more CR0 and CR4 bits
  target/i386: pass cr3 to mmu_translate
  target/i386: extract mmu_translate
  target/i386: move paging mode constants from SVM to cpu.h
  target/i386: merge SVM_NPTEXIT_* with PF_ERROR_* constants
  accel: add init_accel_cpu for adapting accel behavior to CPU type
  accel: move call to accel_init_interfaces
  i386: make cpu_load_efer sysemu-only
  target/i386: gdbstub: only write CR0/CR2/CR3/EFER for sysemu
  target/i386: gdbstub: introduce aux functions to read/write CS64 regs
  i386: split off sysemu part of cpu.c
  i386: split seg_helper into user-only and sysemu parts
  i386: split svm_helper into sysemu and stub-only user
  i386: separate fpu_helper sysemu-only parts
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-12 16:07:50 +01:00
Claudio Fontana
f5cc5a5c16 i386: split cpu accelerators from cpu.c, using AccelCPUClass
i386 is the first user of AccelCPUClass, allowing to split
cpu.c into:

cpu.c            cpuid and common x86 cpu functionality
host-cpu.c       host x86 cpu functions and "host" cpu type
kvm/kvm-cpu.c    KVM x86 AccelCPUClass
hvf/hvf-cpu.c    HVF x86 AccelCPUClass
tcg/tcg-cpu.c    TCG x86 AccelCPUClass

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

[claudio]:
Rebased on commit b8184135 ("target/i386: allow modifying TCG phys-addr-bits")

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210322132800.7470-5-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-10 15:41:49 -04:00
Anthony PERARD
f1e43b6026 xen: Free xenforeignmemory_resource at exit
Because Coverity complains about it and this is one leak that Valgrind
reports.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Paul Durrant <paul@xen.org>
Message-Id: <20210430163742.469739-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2021-05-10 13:43:58 +01:00
Igor Druzhinin
3e81a71c9f xen-mapcache: avoid a race on memory map while using MAP_FIXED
When we're replacing the existing mapping there is possibility of a race
on memory map with other threads doing mmap operations - the address being
unmapped/re-mapped could be occupied by another thread in between.

Linux mmap man page recommends keeping the existing mappings in place to
reserve the place and instead utilize the fact that the next mmap operation
with MAP_FIXED flag passed will implicitly destroy the existing mappings
behind the chosen address. This behavior is guaranteed by POSIX / BSD and
therefore is portable.

Note that it wouldn't make the replacement atomic for parallel accesses to
the replaced region - those might still fail with SIGBUS due to
xenforeignmemory_map not being atomic. So we're still not expecting those.

Tested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <1618889702-13104-1-git-send-email-igor.druzhinin@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2021-05-10 13:43:58 +01:00
Igor Mammedov
bb9feea431 x86: acpi: use offset instead of pointer when using build_header()
Do the same as in commit
 (4d027afeb3 Virt: ACPI: fix qemu assert due to re-assigned table data address)
for remaining tables that happen to use saved at
the beginning pointer to build header to avoid assert
when table_data is relocated due to implicit re-size.

In this case user is trying to start Windows 10 and getting assert at
 hw/acpi/bios-linker-loader.c:239:
  bios_linker_loader_add_checksum: Assertion `start_offset < file->blob->len' failed.

Fixes: https://bugs.launchpad.net/bugs/1923497
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210414084356.3792113-1-imammedo@redhat.com>
Cc: mst@redhat.com, qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-04 07:27:37 -04:00
Jean-Philippe Brucker
5d31e1e59a amd_iommu: Fix pte_override_page_mask()
AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page
size. Quoting the AMD IOMMU specification: "When the Next Level bits [of
a pte] are 7h, the size of the page is determined by the first zero bit
in the page address, starting from bit 12."

So if the lowest bits of the page address is 0, the page is 8kB. If the
lowest bits are 011, the page is 32kB. Currently pte_override_page_mask()
doesn't compute the right value for this page size and amdvi_translate()
can return the wrong guest-physical address. With a Linux guest, DMA
from SATA devices accesses the wrong memory and causes probe failure:

qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \
		-device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0
[    6.613093] ata1.00: qc timeout (cmd 0xec)
[    6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)

Fix the page mask.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-04 07:27:37 -04:00
Thomas Huth
ee86213aa3 Do not include exec/address-spaces.h if it's not really necessary
Stop including exec/address-spaces.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-5-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
2068cabd3f Do not include cpu.h if it's not really necessary
Stop including cpu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-4-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
ead62c75f6 Do not include hw/boards.h if it's not really necessary
Stop including hw/boards.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-3-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
4c386f8064 Do not include sysemu/sysemu.h if it's not really necessary
Stop including sysemu/sysemu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Thomas Huth
e924921f5c hw: Do not include hw/irq.h if it is not necessary
Many files include hw/irq.h without needing it. Remove the superfluous
include statements.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20210327050236.2232347-1-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Cornelia Huck
da7e13c00b hw: add compat machines for 6.1
Add 6.1 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Message-id: 20210331111900.118274-1-cohuck@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:51 +01:00
Marian Postevca
d07b22863b acpi: Move setters/getters of oem fields to X86MachineState
The code that sets/gets oem fields is duplicated in both PC and MICROVM
variants. This commit moves it to X86MachineState so that all x86
variants can use it and duplication is removed.

Signed-off-by: Marian Postevca <posteuca@mutex.one>
Message-Id: <20210221001737.24499-2-posteuca@mutex.one>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
6930ba0d44 acpi: Move maximum size logic into acpi_add_rom_blob()
We want to have safety margins for all tables based on the table type.
Let's move the maximum size logic into acpi_add_rom_blob() and make it
dependent on the table name, so we don't have to replicate for each and
every instance that creates such tables.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-4-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
2a3bdc5cec microvm: Don't open-code "etc/table-loader"
Let's just reuse ACPI_BUILD_LOADER_FILE.

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-3-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
6c2b24d1d2 acpi: Set proper maximum size for "etc/table-loader" blob
The resizeable memory region / RAMBlock that is created for the cmd blob
has a maximum size of whole host pages (e.g., 4k), because RAMBlocks
work on full host pages. In addition, in i386 ACPI code:
  acpi_align_size(tables->linker->cmd_blob, ACPI_BUILD_ALIGN_SIZE);
makes sure to align to multiples of 4k, padding with 0.

For example, if our cmd_blob is created with a size of 2k, the maximum
size is 4k - we cannot grow beyond that. Growing might be required
due to guest action when rebuilding the tables, but also on incoming
migration.

This automatic generation of the maximum size used to be sufficient,
however, there are cases where we cross host pages now when growing at
runtime: we exceed the maximum size of the RAMBlock and can crash QEMU when
trying to resize the resizeable memory region / RAMBlock:
  $ build/qemu-system-x86_64 --enable-kvm \
      -machine q35,nvdimm=on \
      -smp 1 \
      -cpu host \
      -m size=2G,slots=8,maxmem=4G \
      -object memory-backend-file,id=mem0,mem-path=/tmp/nvdimm,size=256M \
      -device nvdimm,label-size=131072,memdev=mem0,id=nvdimm0,slot=1 \
      -nodefaults \
      -device vmgenid \
      -device intel-iommu

Results in:
  Unexpected error in qemu_ram_resize() at ../softmmu/physmem.c:1850:
  qemu-system-x86_64: Size too large: /rom@etc/table-loader:
    0x2000 > 0x1000: Invalid argument

In this configuration, we consume exactly 4k (32 entries, 128 bytes each)
when creating the VM. However, once the guest boots up and maps the MCFG,
we also create the MCFG table and end up consuming 2 additional entries
(pointer + checksum) -- which is where we try resizing the memory region
/ RAMBlock, however, the maximum size does not allow for it.

Currently, we get the following maximum sizes for our different
mutable tables based on behavior of resizeable RAMBlock:

  hw       table                max_size
  -------  ---------------------------------------------------------

  virt     "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  virt     "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  virt     "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

  i386     "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  i386     "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  i386     "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

  microvm  "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  microvm  "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  microvm  "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

Let's set the maximum table size for "etc/table-loader" to 64k, so we
can properly grow at runtime, which should be good enough for the future.

Migration is not concerned with the maximum size of a RAMBlock, only
with the used size - so existing setups are not affected. Of course, we
cannot migrate a VM that would have crash when started on older QEMU from
new QEMU to older QEMU without failing early on the destination when
synchronizing the RAM state:
    qemu-system-x86_64: Size too large: /rom@etc/table-loader: 0x2000 > 0x1000: Invalid argument
    qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram'
    qemu-system-x86_64: load of migration failed: Invalid argument

We'll refactor the code next, to make sure we get rid of this implicit
behavior for "etc/acpi/rsdp" as well and to make the code easier to
grasp.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-2-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
b7f23f62e4 pci: acpi: add _DSM method to PCI devices
Implement _DSM according to:
    PCI Firmware Specification 3.1
    4.6.7.  DSM for Naming a PCI or PCI Express Device Under
            Operating Systems
and wire it up to cold and hot-plugged PCI devices.
Feature depends on ACPI hotplug being enabled (as that provides
PCI devices descriptions in ACPI and MMIO registers that are
reused to fetch acpi-index).

acpi-index should work for
  - cold plugged NICs:
      $QEMU -device e1000,acpi-index=100
         => 'eno100'
  - hot-plugged
      (monitor) device_add e1000,acpi-index=200,id=remove_me
         => 'eno200'
  - re-plugged
      (monitor) device_del remove_me
      (monitor) device_add e1000,acpi-index=1
         => 'eno1'

Windows also sees index under "PCI Label Id" field in properties
dialog but otherwise it doesn't seem to have any effect.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
b32bd763a1 pci: introduce acpi-index property for PCI device
In x86/ACPI world, linux distros are using predictable
network interface naming since systemd v197. Which on
QEMU based VMs results into path based naming scheme,
that names network interfaces based on PCI topology.

With itm on has to plug NIC in exactly the same bus/slot,
which was used when disk image was first provisioned/configured
or one risks to loose network configuration due to NIC being
renamed to actually used topology.
That also restricts freedom to reshape PCI configuration of
VM without need to reconfigure used guest image.

systemd also offers "onboard" naming scheme which is
preferred over PCI slot/topology one, provided that
firmware implements:
    "
    PCI Firmware Specification 3.1
    4.6.7.  DSM for Naming a PCI or PCI Express Device Under
            Operating Systems
    "
that allows to assign user defined index to PCI device,
which systemd will use to name NIC. For example, using
  -device e1000,acpi-index=100
guest will rename NIC to 'eno100', where 'eno' is default
prefix for "onboard" naming scheme. This doesn't require
any advance configuration on guest side to com in effect
at 'onboard' scheme takes priority over path based naming.

Hope is that 'acpi-index' it will be easier to consume by
management layer, compared to forcing specific PCI topology
and/or having several disk image templates for different
topologies and will help to simplify process of spawning
VM from the same template without need to reconfigure
guest NIC.

This patch adds, 'acpi-index'* property and wires up
a 32bit register on top of pci hotplug register block
to pass index value to AML code at runtime.
Following patch will add corresponding _DSM code and
wire it up to PCI devices described in ACPI.

*) name comes from linux kernel terminology

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Daniel P. Berrangé
879be3af49 hw/scsi: remove 'scsi-disk' device
The 'scsi-hd' and 'scsi-cd' devices provide suitable alternatives.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Daniel P. Berrangé
b501018339 hw/ide: remove 'ide-drive' device
The 'ide-hd' and 'ide-cd' devices provide suitable alternatives.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2021-03-18 09:22:55 +00:00
Eric Auger
f14fb6c2db dma: Introduce dma_aligned_pow2_mask()
Currently get_naturally_aligned_size() is used by the intel iommu
to compute the maximum invalidation range based on @size which is
a power of 2 while being aligned with the @start address and less
than the maximum range defined by @gaw.

This helper is also useful for other iommu devices (virtio-iommu,
SMMUv3) to make sure IOMMU UNMAP notifiers only are called with
power of 2 range sizes.

Let's move this latter into dma-helpers.c and rename it into
dma_aligned_pow2_mask(). Also rewrite the helper so that it
accomodates UINT64_MAX values for the size mask and max mask.
It now returns a mask instead of a size. Change the caller.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-id: 20210309102742.30442-3-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-12 12:40:10 +00:00
Eric Auger
41ce9a9126 intel_iommu: Fix mask may be uninitialized in vtd_context_device_invalidate
With -Werror=maybe-uninitialized configuration we get
../hw/i386/intel_iommu.c: In function ‘vtd_context_device_invalidate’:
../hw/i386/intel_iommu.c:1888:10: error: ‘mask’ may be used
uninitialized in this function [-Werror=maybe-uninitialized]
 1888 |     mask = ~mask;
      |     ~~~~~^~~~~~~

Add a g_assert_not_reached() to avoid the error.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210309102742.30442-2-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-12 12:40:09 +00:00
Peter Maydell
6f34661b6c Pull request
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmBJQHkSHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748EdsP/2U2CGTM95tjDunTs9uZV/7zM6PWt85M
 vAPItNVU2jYPfzmaJN8twrzlj0PEDhvB9Q+OJjE4HEGxEbPcdblLg/R6Zs/EaWuY
 N6oKHPXnOnHb+e80UUJdiAq+Y5RUnJbb5L3ArycnVzBgws+Oj3DtqjB2VDccY4C/
 Gkt23tZ7ikU4958e5VBqW2NUUrr+BQO0mqsW+sbbeE3WPj75NQc6srvS3TWvsg7W
 OYEyVYwm52/q2W/1a3Knfv/YO6UU9NGMpGyDLD2kwQwKbgUWYLW2BiWVwOAUldo9
 De3nfKbKnFezLCZAZro20lfCa/aKwNGCOXWzlrKxqUQCmGYUx7gM1+3ahrSd5N0v
 zUgLdZm7O428ZHL6GujWGLA1UwwzpM9X3P3yo4c0S1J6fHypbI6a9jtewrUFvFgP
 TuQ7dp6cn2DTBYUcsrWilPHbTZMADYQNRD/xUtKqalYBEWy3FX5W75+OYBJKKh+X
 Qip68m6JBzgkszXhCcu6xlLb8ynZJr2VsHvtvIgf4NnLqNOIEgVLcMtoMZT8DPrp
 rIoRc5oUFz8zj5lHnJuLADBUvlCMqoCCoU3h2aqHwH8a7RGb180f+82BW9aBcb2u
 Jk+WgAhBUjWBBC97ReFgrINUD/qZRXVoOq8LthTuQSSyr/i1zq+oLM1F0EDXcMDm
 ssATku2IxL24
 =moUF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/trivial-branch-for-6.0-pull-request' into staging

Pull request

# gpg: Signature made Wed 10 Mar 2021 21:56:09 GMT
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/trivial-branch-for-6.0-pull-request: (22 commits)
  sysemu: Let VMChangeStateHandler take boolean 'running' argument
  sysemu/runstate: Let runstate_is_running() return bool
  hw/lm32/Kconfig: Have MILKYMIST select LM32_DEVICES
  hw/lm32/Kconfig: Rename CONFIG_LM32 -> CONFIG_LM32_DEVICES
  hw/lm32/Kconfig: Introduce CONFIG_LM32_EVR for lm32-evr/uclinux boards
  qemu-common.h: Update copyright string to 2021
  tests/fp/fp-test: Replace the word 'blacklist'
  qemu-options: Replace the word 'blacklist'
  seccomp: Replace the word 'blacklist'
  scripts/tracetool: Replace the word 'whitelist'
  ui: Replace the word 'whitelist'
  virtio-gpu: Adjust code space style
  exec/memory: Use struct Object typedef
  fuzz-test: remove unneccessary debugging flags
  net: Use id_generate() in the network subsystem, too
  MAINTAINERS: Fix the location of tools manuals
  vhost_user_gpu: Drop dead check for g_malloc() failure
  backends/dbus-vmstate: Fix short read error handling
  target/hexagon/gen_tcg_funcs: Fix a typo
  hw/elf_ops: Fix a typo
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-03-11 18:55:27 +00:00
Philippe Mathieu-Daudé
538f049704 sysemu: Let VMChangeStateHandler take boolean 'running' argument
The 'running' argument from VMChangeStateHandler does not require
other value than 0 / 1. Make it a plain boolean.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210111152020.1422021-3-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-03-09 23:13:57 +01:00
Chen Qun
d6eb39b554 qtest: delete superfluous inclusions of qtest.h
There are 23 files that include the "sysemu/qtest.h",
but they do not use any qtest functions.

Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210226081414.205946-1-kuhn.chenqun@huawei.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-03-09 06:03:53 +01:00
David Edmondson
e20e182ea0 x86/pvh: extract only 4 bytes of start address for 32 bit kernels
When loading the PVH start address from a 32 bit ELF note, extract
only the appropriate number of bytes.

Fixes: ab969087da ("pvh: Boot uncompressed kernel using direct boot ABI")
Signed-off-by: David Edmondson <david.edmondson@oracle.com>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20210302090315.3031492-3-david.edmondson@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-03-06 11:41:54 +01:00
Vitaly Cheptsov
0a343a5add i386/acpi: restore device paths for pre-5.1 vms
After fixing the _UID value for the primary PCI root bridge in
af1b80ae it was discovered that this change updates Windows
configuration in an incompatible way causing network configuration
failure unless DHCP is used. More details provided on the list:

https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg08484.html

This change reverts the _UID update from 1 to 0 for q35 and i440fx
VMs before version 5.2 to maintain the original behaviour when
upgrading.

Cc: qemu-stable@nongnu.org
Cc: qemu-devel@nongnu.org
Reported-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Vitaly Cheptsov <cheptsov@ispras.ru>
Message-Id: <20210301195919.9333-1-cheptsov@ispras.ru>
Tested-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fixes: af1b80ae56 ("i386/acpi: fix inconsistent QEMU/OVMF device paths")
2021-03-02 05:40:35 -05:00
Sean Christopherson
51124bbfd2 i386: acpi: Don't build HPET ACPI entry if HPET is disabled
Omit HPET AML if the HPET is disabled, QEMU is not emulating it and the
guest may get confused by seeing HPET in the ACPI tables without a
"physical" device present.

The change of DSDT when -no-hpet is as follows.

@@ -141,47 +141,6 @@ DefinitionBlock ("", "DSDT", 1, "BOCHS "
         }
     }

-    Scope (_SB)
-    {
-        Device (HPET)
-        {
-            Name (_HID, EisaId ("PNP0103") /* HPET System Timer */)  // _HID: Hardware ID
-            Name (_UID, Zero)  // _UID: Unique ID
-            OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x0400)
-            Field (HPTM, DWordAcc, Lock, Preserve)
-            {
-                VEND,   32,
-                PRD,    32
-            }
-
-            Method (_STA, 0, NotSerialized)  // _STA: Status
-            {
-                Local0 = VEND /* \_SB_.HPET.VEND */
-                Local1 = PRD /* \_SB_.HPET.PRD_ */
-                Local0 >>= 0x10
-                If (((Local0 == Zero) || (Local0 == 0xFFFF)))
-                {
-                    Return (Zero)
-                }
-
-                If (((Local1 == Zero) || (Local1 > 0x05F5E100)))
-                {
-                    Return (Zero)
-                }
-
-                Return (0x0F)
-            }
-
-            Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
-            {
-                Memory32Fixed (ReadOnly,
-                    0xFED00000,         // Address Base
-                    0x00000400,         // Address Length
-                    )
-            })
-        }
-    }
-
     Scope (_SB.PCI0)
     {
         Device (ISA)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <66114dead09232d04891b9e5f5a4081e85cc2c4d.1613615732.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23 10:58:42 -05:00
Isaku Yamahata
e3fb55f065 hw/i386: declare ACPI mother board resource for MMCONFIG region
Declare PNP0C01 device to reserve MMCONFIG region to conform to the
spec better and play nice with guest BIOSes/OSes.

According to PCI Firmware Specification[0], MMCONFIG region must be
reserved by declaring a motherboard resource. It's optional to reserve
the region in memory map by Int 15 E820h or EFIGetMemoryMap.
Guest Linux checks if the MMCFG region is reserved by bios memory map
or ACPI resource. If it's not reserved, Linux falls back to legacy PCI
configuration access.

TDVF [1] [2] doesn't reserve MMCONFIG the region in memory map.
On the other hand OVMF reserves it in memory map without declaring a
motherboard resource. With memory map reservation, linux guest uses
MMCONFIG region. However it doesn't comply to PCI Firmware
specification.

[0] PCI Firmware specification Revision 3.2
  4.1.2 MCFG Table Description table 4-2 NOTE 2
  If the operating system does not natively comprehend reserving the
  MMCFG region, The MMCFG region must e reserved by firmware. ...
  For most systems, the mortheroard resource would appear at the root
  of the ACPI namespace (under \_SB)...
  The resource can optionally be returned in Int15 E820h or
  EFIGetMemoryMap as reserved memory but must always be reported
  through ACPI as a motherboard resource

[1] TDX: Intel Trust Domain Extension
    https://software.intel.com/content/www/us/en/develop/articles/intel-trust-domain-extensions.html
[2] TDX Virtual Firmware
    https://github.com/tianocore/edk2-staging/tree/TDVF

The change to DSDT is as follows.
@@ -68,32 +68,47 @@

                     If ((CDW3 != Local0))
                     {
                         CDW1 |= 0x10
                     }

                     CDW3 = Local0
                 }
                 Else
                 {
                     CDW1 |= 0x04
                 }

                 Return (Arg3)
             }
         }
+
+        Device (DRAC)
+        {
+            Name (_HID, "PNP0C01" /* System Board */)  // _HID: Hardware ID
+            Name (_CRS, ResourceTemplate ()  // _CRS: Current Resource Settings
+            {
+                DWordMemory (ResourceProducer, PosDecode, MinFixed, MaxFixed, NonCacheable, ReadWrite,
+                    0x00000000,         // Granularity
+                    0xB0000000,         // Range Minimum
+                    0xBFFFFFFF,         // Range Maximum
+                    0x00000000,         // Translation Offset
+                    0x10000000,         // Length
+                    ,, , AddressRangeMemory, TypeStatic)
+            })
+        }
     }

     Scope (_SB)
     {
         Device (HPET)
         {
             Name (_HID, EisaId ("PNP0103") /* HPET System Timer */)  // _HID: Hardware ID
             Name (_UID, Zero)  // _UID: Unique ID
             OperationRegion (HPTM, SystemMemory, 0xFED00000, 0x0400)
             Field (HPTM, DWordAcc, Lock, Preserve)
             {
                 VEND,   32,
                 PRD,    32
             }

             Method (_STA, 0, NotSerialized)  // _STA: Status

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <6f686b45ce7bc43048c56dbb46e72e1fe51927e6.1613615732.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2021-02-23 10:58:42 -05:00
Isaku Yamahata
33b44fdaba acpi: set fadt.smi_cmd to zero when SMM is not supported
>From table 5.9 SMI_CMD of ACPI spec
> This field is reserved and must be zero on system
> that does not support System Management mode.

When smm is not enabled, set it to zero to comform to the spec.
When -machine smm=off is passed, the change to FACP is as follows.

@@ -1,46 +1,46 @@
 /*
  * Intel ACPI Component Architecture
  * AML/ASL+ Disassembler version 20180105 (64-bit version)
  * Copyright (c) 2000 - 2018 Intel Corporation
  *
- * Disassembly of tests/data/acpi/q35/FACP, Fri Feb  5 16:57:04 2021
+ * Disassembly of /tmp/aml-1OQYX0, Fri Feb  5 16:57:04 2021
  *
  * ACPI Data Table [FACP]
  *
  * Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
  */

 [000h 0000   4]                    Signature : "FACP"    [Fixed ACPI Description Table (FADT)]
 [004h 0004   4]                 Table Length : 000000F4
 [008h 0008   1]                     Revision : 03
-[009h 0009   1]                     Checksum : 1F
+[009h 0009   1]                     Checksum : D6
 [00Ah 0010   6]                       Oem ID : "BOCHS "
 [010h 0016   8]                 Oem Table ID : "BXPCFACP"
 [018h 0024   4]                 Oem Revision : 00000001
 [01Ch 0028   4]              Asl Compiler ID : "BXPC"
 [020h 0032   4]        Asl Compiler Revision : 00000001

 [024h 0036   4]                 FACS Address : 00000000
 [028h 0040   4]                 DSDT Address : 00000000
 [02Ch 0044   1]                        Model : 01
 [02Dh 0045   1]                   PM Profile : 00 [Unspecified]
 [02Eh 0046   2]                SCI Interrupt : 0009
-[030h 0048   4]             SMI Command Port : 000000B2
-[034h 0052   1]            ACPI Enable Value : 02
-[035h 0053   1]           ACPI Disable Value : 03
+[030h 0048   4]             SMI Command Port : 00000000
+[034h 0052   1]            ACPI Enable Value : 00
+[035h 0053   1]           ACPI Disable Value : 00
 [036h 0054   1]               S4BIOS Command : 00
 [037h 0055   1]              P-State Control : 00
 [038h 0056   4]     PM1A Event Block Address : 00000600
 [03Ch 0060   4]     PM1B Event Block Address : 00000000
 [040h 0064   4]   PM1A Control Block Address : 00000604
 [044h 0068   4]   PM1B Control Block Address : 00000000
 [048h 0072   4]    PM2 Control Block Address : 00000000
 [04Ch 0076   4]       PM Timer Block Address : 00000608
 [050h 0080   4]           GPE0 Block Address : 00000620
 [054h 0084   4]           GPE1 Block Address : 00000000
 [058h 0088   1]       PM1 Event Block Length : 04
 [059h 0089   1]     PM1 Control Block Length : 02
 [05Ah 0090   1]     PM2 Control Block Length : 00
 [05Bh 0091   1]        PM Timer Block Length : 04
 [05Ch 0092   1]            GPE0 Block Length : 10
 [05Dh 0093   1]            GPE1 Block Length : 00

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-Id: <09ed791ef77fda2b194100669cbc690865c9eb52.1613615732.git.isaku.yamahata@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-23 10:58:42 -05:00
Gan Qixin
dbb6b0c78b vmmouse: put it into the 'input' category
The category of the vmmouse device is not set, put it into the 'input'
category.

Signed-off-by: Gan Qixin <ganqixin@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20201130083630.2520597-4-ganqixin@huawei.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-02-20 12:36:19 +01:00
Philippe Mathieu-Daudé
6661d9a58a hw/i386/xen: Remove dead code
'drivers_blacklisted' is never accessed, remove it.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <20210202155644.998812-1-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-02-20 12:36:19 +01:00
Paolo Bonzini
b2f73a0784 sev/i386: Allow AP booting under SEV-ES
When SEV-ES is enabled, it is not possible modify the guests register
state after it has been initially created, encrypted and measured.

Normally, an INIT-SIPI-SIPI request is used to boot the AP. However, the
hypervisor cannot emulate this because it cannot update the AP register
state. For the very first boot by an AP, the reset vector CS segment
value and the EIP value must be programmed before the register has been
encrypted and measured. Search the guest firmware for the guest for a
specific GUID that tells Qemu the value of the reset vector to use.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <22db2bfb4d6551aed661a9ae95b4fdbef613ca21.1611682609.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
James Bottomley
9617cddb72 pc: add parser for OVMF reset block
OVMF is developing a mechanism for depositing a GUIDed table just
below the known location of the reset vector.  The table goes
backwards in memory so all entries are of the form

<data>|len|<GUID>

Where <data> is arbtrary size and type, <len> is a uint16_t and
describes the entire length of the entry from the beginning of the
data to the end of the guid.

The foot of the table is of this form and <len> for this case
describes the entire size of the table.  The table foot GUID is
defined by OVMF as 96b582de-1fb2-45f7-baea-a366c55a082d and if the
table is present this GUID is just below the reset vector, 48 bytes
before the end of the firmware file.

Add a parser for the ovmf reset block which takes a copy of the block,
if the table foot guid is found, minus the footer and a function for
later traversal to return the data area of any specified GUIDs.

Signed-off-by: James Bottomley <jejb@linux.ibm.com>

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20210204193939.16617-2-jejb@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-02-16 17:15:39 +01:00
David Gibson
aacdb84413 sev: Remove false abstraction of flash encryption
When AMD's SEV memory encryption is in use, flash memory banks (which are
initialed by pc_system_flash_map()) need to be encrypted with the guest's
key, so that the guest can read them.

That's abstracted via the kvm_memcrypt_encrypt_data() callback in the KVM
state.. except, that it doesn't really abstract much at all.

For starters, the only call site is in code specific to the 'pc'
family of machine types, so it's obviously specific to those and to
x86 to begin with.  But it makes a bunch of further assumptions that
need not be true about an arbitrary confidential guest system based on
memory encryption, let alone one based on other mechanisms:

 * it assumes that the flash memory is defined to be encrypted with the
   guest key, rather than being shared with hypervisor
 * it assumes that that hypervisor has some mechanism to encrypt data into
   the guest, even though it can't decrypt it out, since that's the whole
   point
 * the interface assumes that this encrypt can be done in place, which
   implies that the hypervisor can write into a confidential guests's
   memory, even if what it writes isn't meaningful

So really, this "abstraction" is actually pretty specific to the way SEV
works.  So, this patch removes it and instead has the PC flash
initialization code call into a SEV specific callback.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2021-02-08 16:57:38 +11:00
Michael S. Tsirkin
43e229a52b acpi: use constants as strncpy limit
gcc is not smart enough to figure out length was validated before use as
strncpy limit, resulting in this warning:

inlined from ‘virt_set_oem_table_id’ at ../../hw/arm/virt.c:2197:5:
/usr/include/aarch64-linux-gnu/bits/string_fortified.h:106:10: error:
‘__builtin_strncpy’ specified bound depends on the length of the
source argument [-Werror=stringop-overflow=]

Simplify things by using a constant limit instead.

Fixes: 97fc5d507fca ("acpi: Permit OEM ID and OEM table ID fields to be changed")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-05 08:52:59 -05:00
Marian Postevca
602b458201 acpi: Permit OEM ID and OEM table ID fields to be changed
Qemu's ACPI table generation sets the fields OEM ID and OEM table ID
to "BOCHS " and "BXPCxxxx" where "xxxx" is replaced by the ACPI
table name.

Some games like Red Dead Redemption 2 seem to check the ACPI OEM ID
and OEM table ID for the strings "BOCHS" and "BXPC" and if they are
found, the game crashes(this may be an intentional detection
mechanism to prevent playing the game in a virtualized environment).

This patch allows you to override these default values.

The feature can be used in this manner:
qemu -machine oem-id=ABCDEF,oem-table-id=GHIJKLMN

The oem-id string can be up to 6 bytes in size, and the
oem-table-id string can be up to 8 bytes in size. If the string are
smaller than their respective sizes they will be padded with space.
If either of these parameters is not set, the current default values
will be used for the one missing.

Note that the the OEM Table ID field will not be extended with the
name of the table, but will use either the default name or the user
provided one.

This does not affect the -acpitable option (for user-defined ACPI
tables), which has precedence over -machine option.

Signed-off-by: Marian Postevca <posteuca@mutex.one>
Message-Id: <20210119003216.17637-3-posteuca@mutex.one>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-02-05 08:52:59 -05:00
Thomas Huth
f862ddbb1a hw/i386: Remove the deprecated pc-1.x machine types
They have been deprecated since QEMU v5.0, time to remove them now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210203171832.483176-2-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2021-02-05 08:52:59 -05:00