Commit Graph

1954 Commits

Author SHA1 Message Date
Igor Mammedov
d7346e614f acpi: x86: pcihp: add support hotplug on multifunction bridges
Commit [1] switched PCI hotplug from native to ACPI one by default.

That however breaks hotplug on following CLI that used to work:
   -nodefaults -machine q35 \
   -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
   -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2

where PCI device is hotplugged to pcie-root-port-1 with error on guest side:

  ACPI BIOS Error (bug): Could not resolve symbol [^S0B.PCNT], AE_NOT_FOUND (20201113/psargs-330)
  ACPI Error: Aborting method \_SB.PCI0.PCNT due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
  ACPI Error: Aborting method \_GPE._E01 due to previous error (AE_NOT_FOUND) (20201113/psparse-531)
  ACPI Error: AE_NOT_FOUND, while evaluating GPE method [_E01] (20201113/evgpe-515)

cause is that QEMU's ACPI hotplug never supported functions other then 0
and due to bug it was generating notification entries for not described
functions.

Technically there is no reason not to describe cold-plugged bridges
(root ports) on functions other then 0, as they similarly to bridge
on function 0 are unpluggable.

So since we need to describe multifunction devices iterate over
fuctions as well. But describe only cold-plugged bridges[root ports]
on functions other than 0 as well.

1)
Fixes: 17858a1695 (hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20210723090424.2092226-1-imammedo@redhat.com>
Fixes: 17858a1695 (hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35)<br>
Signed-off-by: Igor Mammedov &lt;<a href="mailto:imammedo@redhat.com" target="_blank">imammedo@redhat.com</a>&gt;<br>
Reported-by: Laurent Vivier &lt;<a href="mailto:lvivier@redhat.com" target="_blank">lvivier@redhat.com</a>&gt;<br>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-08-03 16:31:07 -04:00
Philippe Mathieu-Daudé
df90457cf5 hw/i386/Kconfig: Add missing Kconfig dependency (runtime error)
When building the 'microvm' machine stand-alone we get:

  $ qemu-system-x86_64 -M microvm
  **
  ERROR:qom/object.c:714:object_new_with_type: assertion failed: (type != NULL)
  Bail out! ERROR:qom/object.c:714:object_new_with_type: assertion failed: (type != NULL)
  Aborted (core dumped)

Looking at the backtrace:

  (gdb) bt
  #3  0x00007ff2330492ff in g_assertion_message_expr () at /lib64/libglib-2.0.so.0
  #4  0x000055a878c18341 in object_new_with_type (type=<optimized out>) at qom/object.c:714
  #5  0x000055a878c18399 in object_new (typename=typename@entry=0x55a878dec36a "isa-pit") at qom/object.c:747
  #6  0x000055a878cc8146 in qdev_new (name=name@entry=0x55a878dec36a "isa-pit") at hw/core/qdev.c:153
  #7  0x000055a878a8b439 in isa_new (name=name@entry=0x55a878dec36a "isa-pit") at hw/isa/isa-bus.c:160
  #8  0x000055a878adb782 in i8254_pit_init (base=64, isa_irq=0, alt_irq=0x0, bus=0x55a87ab38760) at include/hw/timer/i8254.h:54
  #9  microvm_devices_init (mms=0x55a87ac36800) at hw/i386/microvm.c:263
  #10 microvm_machine_state_init (machine=<optimized out>) at hw/i386/microvm.c:471
  #11 0x000055a878a944ab in machine_run_board_init (machine=machine@entry=0x55a87ac36800) at hw/core/machine.c:1239

The "isa-pit" type (TYPE_I8254) is missing. Add it.

Fixes: 0ebf007dda ("hw/i386: Introduce the microvm machine type")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210616204328.2611406-24-philmd@redhat.com>
2021-07-20 15:29:46 +02:00
Xingang Wang
dec2f5636e hw/i386/acpi-build: Add IVRS support to bypass iommu
Check bypass_iommu to exclude the devices which will bypass iommu.

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-9-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 11:10:45 -04:00
Xingang Wang
26863366b2 hw/i386/acpi-build: Add DMAR support to bypass iommu
In DMAR table, the drhd is set to cover all PCI devices when intel_iommu
is on. To support bypass iommu feature, we need to walk the PCI bus with
bypass_iommu disabled and add explicit scope data in DMAR drhd structure.

/mnt/sdb/wxg/qemu-next/qemu/build/x86_64-softmmu/qemu-system-x86_64 \
 -machine q35,accel=kvm,default_bus_bypass_iommu=true \
 -cpu host \
 -m 16G \
 -smp 36,sockets=2,cores=18,threads=1 \
 -device pxb-pcie,bus_nr=0x10,id=pci.10,bus=pcie.0,addr=0x3 \
 -device pxb-pcie,bus_nr=0x20,id=pci.20,bus=pcie.0,addr=0x4,bypass_iommu=true \
 -device pcie-root-port,port=0x1,chassis=1,id=pci.11,bus=pci.10,addr=0x0 \
 -device pcie-root-port,port=0x2,chassis=2,id=pci.21,bus=pci.20,addr=0x0 \
 -device virtio-scsi-pci,id=scsi0,bus=pci.11,addr=0x0 \
 -device virtio-scsi-pci,id=scsi1,bus=pci.21,addr=0x0 \
 -drive file=/mnt/sdb/wxg/fedora-48g.qcow2,format=qcow2,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native \
 -device scsi-hd,bus=scsi1.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1 \
 -device intel-iommu \
 -nographic \

And we get the guest configuration:

~ lspci -vt
-+-[0000:20]---00.0-[21]----00.0  Red Hat, Inc. Virtio SCSI
 +-[0000:10]---00.0-[11]----00.0  Red Hat, Inc. Virtio SCSI
 \-[0000:00]-+-00.0  Intel Corporation 82G33/G31/P35/P31 Express DRAM Controller
             +-01.0  Device 1234:1111
             +-02.0  Intel Corporation 82574L Gigabit Network Connection
             +-03.0  Red Hat, Inc. QEMU PCIe Expander bridge
             +-04.0  Red Hat, Inc. QEMU PCIe Expander bridge
             +-1f.0  Intel Corporation 82801IB (ICH9) LPC Interface Controller
             +-1f.2  Intel Corporation 82801IR/IO/IH (ICH9R/DO/DH) 6 port SATA Controller [AHCI mode]
             \-1f.3  Intel Corporation 82801I (ICH9 Family) SMBus Controller

With bypass_iommu enabled on root bus, the attached devices will bypass iommu:

/sys/class/iommu/dmar0
├── devices
│   ├── 0000:10:00.0 -> ../../../../pci0000:10/0000:10:00.0
│   └── 0000:11:00.0 -> ../../../../pci0000:10/0000:10:00.0/0000:11:00.0

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-8-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 11:10:45 -04:00
Xingang Wang
c9e96b04fc hw/i386: Add a default_bus_bypass_iommu pc machine option
Add a default_bus_bypass_iommu pc machine option to enable/disable
bypass_iommu for default root bus. The option is disabled by default
and can be enabled with:
$QEMU -machine q35,default_bus_bypass_iommu=true

Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Message-Id: <1625748919-52456-5-git-send-email-wangxingang5@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 11:10:45 -04:00
Julia Suvorova
17858a1695 hw/acpi/ich9: Set ACPI PCI hot-plug as default on Q35
Q35 has three different types of PCI devices hot-plug: PCIe Native,
SHPC Native and ACPI hot-plug. This patch changes the default choice
for cold-plugged bridges from PCIe Native to ACPI Hot-plug with
ability to use SHPC and PCIe Native for hot-plugged bridges.

This is a list of the PCIe Native hot-plug issues that led to this
change:
    * no racy behavior during boot (see 110c477c2e)
    * no delay during deleting - after the actual power off software
      must wait at least 1 second before indicating about it. This case
      is quite important for users, it even has its own bug:
          https://bugzilla.redhat.com/show_bug.cgi?id=1594168
    * no timer-based behavior - in addition to the previous example,
      the attention button has a 5-second waiting period, during which
      the operation can be canceled with a second press. While this
      looks fine for manual button control, automation will result in
      the need to queue or drop events, and the software receiving
      events in all sort of unspecified combinations of attention/power
      indicator states, which is racy and uppredictable.
    * fixes:
        * https://bugzilla.redhat.com/show_bug.cgi?id=1752465
        * https://bugzilla.redhat.com/show_bug.cgi?id=1690256

To return to PCIe Native hot-plug:
    -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off

Known issue: older linux guests need the following flag
to allow hotplugged pci express devices to use io:
        -device pcie-root-port,io-reserve=4096.
io is unusual for pci express so this seems minor.
We'll fix this by a follow up patch.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210713004205.775386-6-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-16 04:34:22 -04:00
Julia Suvorova
3f3cbbb236 hw/pci/pcie: Do not set HPC flag if acpihp is used
Instead of changing the hot-plug type in _OSC register, do not
set the 'Hot-Plug Capable' flag. This way guest will choose ACPI
hot-plug if it is preferred and leave the option to use SHPC with
pcie-pci-bridge.

The ability to control hot-plug for each downstream port is retained,
while 'hotplug=off' on the port means all hot-plug types are disabled.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20210713004205.775386-4-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-16 04:33:35 -04:00
Julia Suvorova
c0e427d6eb hw/acpi/ich9: Enable ACPI PCI hot-plug
Add acpi_pcihp to ich9_pm as part of
'acpi-pci-hotplug-with-bridge-support' option. Set default to false.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210713004205.775386-3-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-16 04:33:35 -04:00
Julia Suvorova
caf108bc58 hw/i386/acpi-build: Add ACPI PCI hot-plug methods to Q35
Implement notifications and gpe to support q35 ACPI PCI hot-plug.
Use 0xcc4 - 0xcd7 range for 'acpi-pci-hotplug' io ports.

Signed-off-by: Julia Suvorova <jusual@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Message-Id: <20210713004205.775386-2-jusual@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-16 04:33:34 -04:00
Philippe Mathieu-Daudé
b5b318608e hw/i386: Introduce X86_FW_OVMF Kconfig symbol
Introduce the X86_FW_OVMF Kconfig symbol for OVMF-specific code.
Move the OVMF-specific code from pc_sysfw.c to pc_sysfw_ovmf.c,
adding a pair of stubs.
Update MAINTAINERS to reach OVMF maintainers when these new
files are modified.

This fixes when building the microvm machine standalone:

  /usr/bin/ld: libqemu-i386-softmmu.fa.p/target_i386_monitor.c.o: in
  function `qmp_sev_inject_launch_secret':
  target/i386/monitor.c:749: undefined reference to `pc_system_ovmf_table_find'

Fixes: f522cef9b3 ("sev: update sev-inject-launch-secret to make gpa optional")
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20210616204328.2611406-22-philmd@redhat.com>
2021-07-14 22:28:58 +02:00
Dov Murik
2165542c8d hw/i386/pc: Document pc_system_ovmf_table_find
Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210701052749.934744-3-dovmurik@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-07-14 22:28:58 +02:00
Dov Murik
35ebc321b4 hw/i386/pc: pc_system_ovmf_table_find: Assert that flash was parsed
Add assertion in pc_system_ovmf_table_find that verifies that the flash
was indeed previously parsed (looking for the OVMF table) by
pc_system_parse_ovmf_flash.

Now pc_system_ovmf_table_find distinguishes between "no one called
pc_system_parse_ovmf_flash" (which will abort due to assertion failure)
and "the flash was parsed but no OVMF table was found, or it is invalid"
(which will return false).

Suggested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210701052749.934744-2-dovmurik@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-07-14 22:28:58 +02:00
Michael Roth
a7a0da844d target/i386: suppress CPUID leaves not defined by the CPU vendor
Currently all built-in CPUs report cache information via CPUID leaves 2
and 4, but these have never been defined for AMD. In the case of
SEV-SNP this can cause issues with CPUID enforcement. Address this by
allowing CPU types to suppress these via a new "x-vendor-cpuid-only"
CPU property, which is true by default, but switched off for older
machine types to maintain compatibility.

Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: zhenwei pi <pizhenwei@bytedance.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Message-Id: <20210708003623.18665-1-michael.roth@amd.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-07-13 09:13:29 -04:00
Igor Mammedov
7193d7cdd9 acpi: pc: revert back to v5.2 PCI slot enumeration
Commit [1] moved _SUN variable from only hot-pluggable to
all devices. This made linux kernel enumerate extra slots
that weren't present before. If extra slot happens to be
be enumerated first and there is a device in th same slot
but on other bridge, linux kernel will add -N suffix to
slot name of the later, thus changing NIC name compared to
QEMU 5.2. This in some case confuses systemd, if it is
using SLOT NIC naming scheme and interface name becomes
not the same as it was under QEMU-5.2.

Reproducer QEMU CLI:
  -M pc-i440fx-5.2 -nodefaults \
  -device pci-bridge,chassis_nr=1,id=pci.1,bus=pci.0,addr=0x3 \
  -device virtio-net-pci,id=nic1,bus=pci.1,addr=0x1 \
  -device virtio-net-pci,id=nic2,bus=pci.1,addr=0x2 \
  -device virtio-net-pci,id=nic3,bus=pci.1,addr=0x3

with RHEL8 guest produces following results:
  v5.2:
     kernel: virtio_net virtio0 ens1: renamed from eth0
     kernel: virtio_net virtio2 ens3: renamed from eth2
     kernel: virtio_net virtio1 enp1s2: renamed from eth1
      (slot 2 is assigned to empty bus 0 slot and virtio1
       is assigned to 2-2 slot, and renaming falls back,
       for some reason, to path based naming scheme)

  v6.0:
     kernel: virtio_net virtio0 ens1: renamed from eth0
     kernel: virtio_net virtio2 ens3: renamed from eth2
     systemd-udevd[299]: Error changing net interface name 'eth1' to 'ens3': File exists
     systemd-udevd[299]: could not rename interface '3' from 'eth1' to 'ens3': File exists
      (with commit [1] kernel assigns virtio2 to 3-2 slot
       since bridge advertises _SUN=0x3 and kernel assigns
       slot 3 to bridge. Still it manages to rename virtio2
       correctly to ens3, however systemd gets confused with virtio1
       where slot allocation exactly the same (2-2) as in 5.2 case
       and tries to rename it to ens3 which is rightfully taken by
       virtio2)

I'm not sure what breaks in systemd interface renaming (it probably
should be investigated), but on QEMU side we can safely revert
_SUN to 5.2 behavior (i.e. avoid cold-plugged bridges and non
hot-pluggable device classes), without breaking acpi-index, which uses
slot numbers but it doesn't have to use _SUN, it could use an arbitrary
variable name that has the same slot value).
It will help existing VMs to keep networking with non trivial
configs in working order since systemd will do its interface
renaming magic as it used to do.

1)
Fixes: b7f23f62e4 (pci: acpi: add _DSM method to PCI devices)
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210624204229.998824-3-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: John Sucaet <john.sucaet@ekinops.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-07-03 03:12:35 -04:00
Peter Maydell
6512fa497c * Some Meson test conversions
* KVM dirty page ring buffer fix
 * KVM TSC scaling support
 * Fixes for SG_IO with /dev/sdX devices
 * (Non)support for host devices on iOS
 * -smp cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmDV5TIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNySgf9HMnAtLWp36p2ie74o4rrW9x3Ojrm
 fuCq2i3q3nBhEKqqiyp+QQJGubE44mXEZQYtX89tOfSFgg7o6SLIoAcQQskr+In6
 f9I1jjpSVTls0AaGUO+iRn9KiTzeMWeo1l6Wht+2mfBL5XpNLaLLu/T49uPhjlvN
 zFi5blgILxIYMqMCD1joDBnIiqqDozr0p7QzRZD8re25sRhg0NHQxyIh3OxBPpJ9
 3Jhy1Us0cDWrwvPbxz6S5N0zesLu1ojtojVPy6iKjyHSv+6eiE6bHyIbS8duG5+H
 zBC1THOsUV3X1UvPAjuSNlgfNeobGAzmxSJ/evLgWWkpkx1mLtsnL5RARQ==
 =YoOL
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* Some Meson test conversions
* KVM dirty page ring buffer fix
* KVM TSC scaling support
* Fixes for SG_IO with /dev/sdX devices
* (Non)support for host devices on iOS
* -smp cleanups

# gpg: Signature made Fri 25 Jun 2021 15:16:18 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (28 commits)
  machine: reject -smp dies!=1 for non-PC machines
  machine: pass QAPI struct to mc->smp_parse
  machine: add error propagation to mc->smp_parse
  machine: move common smp_parse code to caller
  machine: move dies from X86MachineState to CpuTopology
  file-posix: handle EINTR during ioctl
  block: detect DKIOCGETBLOCKCOUNT/SIZE before use
  block: try BSD disk size ioctls one after another
  block: check for sys/disk.h
  block: feature detection for host block support
  file-posix: try BLKSECTGET on block devices too, do not round to power of 2
  block: add max_hw_transfer to BlockLimits
  block-backend: align max_transfer to request alignment
  osdep: provide ROUND_DOWN macro
  scsi-generic: pass max_segments via max_iov field in BlockLimits
  file-posix: fix max_iov for /dev/sg devices
  KVM: Fix dirty ring mmap incorrect size due to renaming accident
  configure, meson: convert libusbredir detection to meson
  configure, meson: convert libcacard detection to meson
  configure, meson: convert libusb detection to meson
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-28 21:04:22 +01:00
Paolo Bonzini
1e63fe6858 machine: pass QAPI struct to mc->smp_parse
As part of converting -smp to a property with a QAPI type, define
the struct and use it to do the actual parsing.  machine_smp_parse
takes care of doing the QemuOpts->QAPI conversion by hand, for now.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-10-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 16:16:11 +02:00
Paolo Bonzini
abc2f51144 machine: add error propagation to mc->smp_parse
Clean up the smp_parse functions to use Error** instead of exiting.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-9-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 16:13:50 +02:00
Paolo Bonzini
593d3c5148 machine: move common smp_parse code to caller
Most of smp_parse and pc_smp_parse is guarded by an "if (opts)"
conditional, and the rest is common to both function.  Move the
conditional and the common code to the caller, machine_smp_parse.

Move the replay_add_blocker call after all errors are checked for.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-8-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 16:13:49 +02:00
Paolo Bonzini
67872eb8ed machine: move dies from X86MachineState to CpuTopology
In order to make SMP configuration a Machine property, we need a getter as
well as a setter.  To simplify the implementation put everything that the
getter needs in the CpuTopology struct.

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20210617155308.928754-7-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-25 16:13:48 +02:00
Philippe Mathieu-Daudé
72ea60e411 hw/block/fdc: Extract ISA floppy controllers to fdc-isa.c
Some machines use floppy controllers via the SysBus interface,
and don't need to pull in all the ISA code.
Extract the ISA specific code to a new unit: fdc-isa.c, and
add a new Kconfig symbol: "FDC_ISA".
This allows us to remove the FIXME from commit dd0ff8191a
("isa: express SuperIO dependencies with Kconfig").

Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 20210614193220.2007159-5-philmd@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2021-06-25 08:53:28 -04:00
Chenyi Qiang
035d1ef265 i386: Add ratelimit for bus locks acquired in guest
A bus lock is acquired through either split locked access to writeback
(WB) memory or any locked access to non-WB memory. It is typically >1000
cycles slower than an atomic operation within a cache and can also
disrupts performance on other cores.

Virtual Machines can exploit bus locks to degrade the performance of
system. To address this kind of performance DOS attack coming from the
VMs, bus lock VM exit is introduced in KVM and it can report the bus
locks detected in guest. If enabled in KVM, it would exit to the
userspace to let the user enforce throttling policies once bus locks
acquired in VMs.

The availability of bus lock VM exit can be detected through the
KVM_CAP_X86_BUS_LOCK_EXIT. The returned bitmap contains the potential
policies supported by KVM. The field KVM_BUS_LOCK_DETECTION_EXIT in
bitmap is the only supported strategy at present. It indicates that KVM
will exit to userspace to handle the bus locks.

This patch adds a ratelimit on the bus locks acquired in guest as a
mitigation policy.

Introduce a new field "bus_lock_ratelimit" to record the limited speed
of bus locks in the target VM. The user can specify it through the
"bus-lock-ratelimit" as a machine property. In current implementation,
the default value of the speed is 0 per second, which means no
restrictions on the bus locks.

As for ratelimit on detected bus locks, simply set the ratelimit
interval to 1s and restrict the quota of bus lock occurence to the value
of "bus_lock_ratelimit". A potential alternative is to introduce the
time slice as a property which can help the user achieve more precise
control.

The detail of bus lock VM exit can be found in spec:
https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Signed-off-by: Chenyi Qiang <chenyi.qiang@intel.com>
Message-Id: <20210521043820.29678-1-chenyi.qiang@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-06-17 14:11:06 -04:00
Stefan Berger
11fb99e6f4 i386: Eliminate all TPM related code if CONFIG_TPM is not set
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210614191335.1968807-2-stefanb@linux.ibm.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2021-06-15 10:54:46 -04:00
Philippe Mathieu-Daudé
585190902a misc: Correct relative include path
Headers should be included from the 'include/' directory,
not from the root directory.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Message-Id: <20210516205034.694788-1-f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-06-05 21:10:42 +02:00
Dmitry Voronetskiy
d84451d38e i386/kvm: The value passed to strerror should be positive
Signed-off-by: Dmitry Voronetskiy <vda1999@yandex.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20210519113528.12474-1-davoronetskiy@gmail.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-06-05 21:01:17 +02:00
Peter Maydell
8c345b3e6a * Update the references to some doc files (use *.rst instead of *.txt)
* Bump minimum versions of some requirements after removing CentOS 7 support
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmC3L1IRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbU8wA//aRAIn0Qa3xoZdsT434P99GHauCZ5ePq3
 meY4co69c+TfkQZ/b0xlyvT3+7bd9Ni92CQL7n/LtX5bs3pIhRjHNIsm50e4x6Da
 0jb422cKWfRIltXlCfdL/dnFRtJluH83M0seRGMvmhveuWPZ19oSIasEiQjeg//e
 KMNK5z4HEnJ1czb3Bf8p38bmmY/O/QEAA5wAqk7iJkJHz6T/GlqImLYYwpFPlHj6
 JQttm0aWsHrsqEXxnuV0/DT1yHyXDB6S4iuAvABZWhv/M/nCaXo0ib0gW5NPtRPo
 Yf3HO163F9/fewJCc1AUsBe1C/2UwmSWRhEtxpr9uuW2Mv9qEl3hkJwd4k6sEOvh
 U4i+GONC4eElPcmECKUfHA9EP+7faDs6xnM6Ev/PIEp+cPJ2QRfklZv4qpMUWYtb
 3KkADchOyVZAsdB8cGmnznDEVmno1Dt0adVUq8CF6uW6MwD3pb4838arrfwIfwOp
 g4yTI1AQQykkxxOaR7GSNoxWRti7TH4fzLhx/xXDd9TKIOWuOepiyuhB7+Q48WAJ
 6EW/JUIzOe7k3GsI0iBsk+y67ED2tpATiRKWWw4QS1BwIhNEMUqgZkucFfKE6Ze1
 B1Xw+Di+1CzrLHV6lCjtdLVmqWqcGcO43HqhSK4sodE87BDGGwy9UxE6oBVt2C1x
 HPQ6/omDlFk=
 =RAkI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/thuth-gitlab/tags/pull-request-2021-06-02' into staging

* Update the references to some doc files (use *.rst instead of *.txt)
* Bump minimum versions of some requirements after removing CentOS 7 support

# gpg: Signature made Wed 02 Jun 2021 08:12:18 BST
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* remotes/thuth-gitlab/tags/pull-request-2021-06-02:
  configure: bump min required CLang to 6.0 / XCode 10.0
  configure: bump min required GCC to 7.5.0
  configure: bump min required glib version to 2.56
  tests/docker: drop CentOS 7 container
  tests/vm: convert centos VM recipe to CentOS 8
  crypto: drop used conditional check
  crypto: bump min gnutls to 3.5.18, dropping RHEL-7 support
  crypto: bump min gcrypt to 1.8.0, dropping RHEL-7 support
  crypto: drop back compatibility typedefs for nettle
  crypto: bump min nettle to 3.4, dropping RHEL-7 support
  patchew: move quick build job from CentOS 7 to CentOS 8 container
  block/ssh: Bump minimum libssh version to 0.8.7
  docs: fix references to docs/devel/s390-dasd-ipl.rst
  docs: fix references to docs/specs/tpm.rst
  docs: fix references to docs/devel/build-system.rst
  docs: fix references to docs/devel/atomics.rst
  docs: fix references to docs/devel/tracing.rst

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-06-02 17:08:11 +01:00
Stefano Garzarella
d0fb9657a3 docs: fix references to docs/devel/tracing.rst
Commit e50caf4a5c ("tracing: convert documentation to rST")
converted docs/devel/tracing.txt to docs/devel/tracing.rst.

We still have several references to the old file, so let's fix them
with the following command:

  sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt)

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210517151702.109066-2-sgarzare@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2021-06-02 06:51:09 +02:00
Daniel P. Berrangé
b7c290177c i386: use better matching family/model/stepping for 'qemu64' CPU
The 'qemu64' CPUID currently reports a family/model/stepping that
approximately corresponds to an AMD K7 vintage architecture.
The K7 series predates the introduction of 64-bit support by AMD
in the K8 series. This has been reported to lead to LLVM complaints
about generating 64-bit code for a 32-bit CPU target

  LLVM ERROR: 64-bit code requested on a subtarget that doesn't support it!

It appears LLVM looks at the family/model/stepping, despite qemu64
reporting it is 64-bit capable.

This patch changes 'qemu64' to report a CPUID with the family, model
and stepping taken from a

 AMD Athlon(tm) 64 X2 Dual Core Processor 4000+

which is one of the first 64-bit AMD CPUs.

Closes https://gitlab.com/qemu-project/qemu/-/issues/191

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20210507133650.645526-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2021-05-31 15:53:03 -04:00
Philippe Mathieu-Daudé
cfa1f4bcee hw/mem/nvdimm: Use Kconfig 'imply' instead of 'depends on'
Per the kconfig.rst:

  A device should be listed [...] ``imply`` if (depending on
  the QEMU command line) the board may or  may not be started
  without it.

This is the case with the NVDIMM device, so use the 'imply'
weak reverse dependency to select the symbol.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210511155354.3069141-2-philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-26 14:49:45 +02:00
Peter Maydell
6005ee07c3 pc,pci,virtio: bugfixes, improvements
Fixes all over the place. Faster boot for virtio. ioeventfd support for
 mmio.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmCeiMEPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRpqsIH/A49Av5Bv8huL75lf9GzCx3E1a/z2W9Fphik
 OcQ1ahR+7CRDARub+vTG40MBmZBVefIWjLAj3BwBWzFGPX0DZq0zeI102VzlEVKY
 OeUx8ixuiKOSLcS+QxE7ZXIBL2Pn7l+MFUi4nLMYKti7c/kola7zlB57qsmXh+VD
 AOQ7Utj6NWoi6QocWJsMSCyHCh3Fk9QzcStLlr6/MkSJa1zqv8l22+8oWH07Fk2M
 wZfhrm9k094on28iSejsFYL5e4ROeXUajbOdfyMIxWvAB7boC9Jxk/e0oAbuSB4y
 2f71Gfk3mU6irS7PvrxcKbk6BVD2zxM2WumOchZJgxFAujDO6yg=
 =fvkT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc,pci,virtio: bugfixes, improvements

Fixes all over the place. Faster boot for virtio. ioeventfd support for
mmio.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 14 May 2021 15:27:13 BST
# gpg:                using RSA key 5D09FD0871C8F85B94CA8A0D281F0DB8D28D5469
# gpg:                issuer "mst@redhat.com"
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>" [full]
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>" [full]
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  Fix build with 64 bits time_t
  vhost-vdpa: Make vhost_vdpa_get_device_id() static
  hw/virtio: enable ioeventfd configuring for mmio
  hw/smbios: support for type 41 (onboard devices extended information)
  checkpatch: Fix use of uninitialized value
  virtio-scsi: Configure all host notifiers in a single MR transaction
  virtio-scsi: Set host notifiers and callbacks separately
  virtio-blk: Configure all host notifiers in a single MR transaction
  virtio-blk: Fix rollback path in virtio_blk_data_plane_start()
  pc-dimm: remove unnecessary get_vmstate_memory_region() method
  amd_iommu: fix wrong MMIO operations
  virtio-net: Constify VirtIOFeature feature_sizes[]
  virtio-blk: Constify VirtIOFeature feature_sizes[]
  hw/virtio: Pass virtio_feature_get_config_size() a const argument
  x86: acpi: use offset instead of pointer when using build_header()
  amd_iommu: Fix pte_override_page_mask()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/arm/virt.c
2021-05-16 17:22:46 +01:00
Vincent Bernat
05dfb447a4 hw/smbios: support for type 41 (onboard devices extended information)
Type 41 defines the attributes of devices that are onboard. The
original intent was to imply the BIOS had some level of control over
the enablement of the associated devices.

If network devices are present in this table, by default, udev will
name the corresponding interfaces enoX, X being the instance number.
Without such information, udev will fallback to using the PCI ID and
this usually gives ens3 or ens4. This can be a bit annoying as the
name of the network card may depend on the order of options and may
change if a new PCI device is added earlier on the commande line.
Being able to provide SMBIOS type 41 entry ensure the name of the
interface won't change and helps the user guess the right name without
booting a first time.

This can be invoked with:

    $QEMU -netdev user,id=internet
          -device virtio-net-pci,mac=50:54:00:00:00:42,netdev=internet,id=internet-dev \
          -smbios type=41,designation='Onboard LAN',instance=1,kind=ethernet,pcidev=internet-dev

The PCI segment is assumed to be 0. This should hold true for most
cases.

    $ dmidecode -t 41
    # dmidecode 3.3
    Getting SMBIOS data from sysfs.
    SMBIOS 2.8 present.

    Handle 0x2900, DMI type 41, 11 bytes
    Onboard Device
            Reference Designation: Onboard LAN
            Type: Ethernet
            Status: Enabled
            Type Instance: 1
            Bus Address: 0000:00:09.0

    $ ip -brief a
    lo               UNKNOWN        127.0.0.1/8 ::1/128
    eno1             UP             10.0.2.14/24 fec0::5254:ff:fe00:42/64 fe80::5254:ff:fe00:42/64

Signed-off-by: Vincent Bernat <vincent@bernat.ch>
Message-Id: <20210401171138.62970-1-vincent@bernat.ch>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-14 10:26:18 -04:00
Roman Kapl
e526ab61e9 amd_iommu: fix wrong MMIO operations
Address was swapped with value when writing MMIO registers, so the user
saw garbage in lot of cases. The interrupt status was not correctly set.

Signed-off-by: Roman Kapl <rka@sysgo.com>
Message-Id: <20210427110504.10878-1-rka@sysgo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-14 10:26:18 -04:00
David Hildenbrand
8f44304c76 numa: Teach ram block notifiers about resizeable ram blocks
Ram block notifiers are currently not aware of resizes. To properly
handle resizes during migration, we want to teach ram block notifiers about
resizeable ram.

Introduce the basic infrastructure but keep using max_size in the
existing notifiers. Supply the max_size when adding and removing ram
blocks. Also, notify on resizes.

Acked-by: Paul Durrant <paul@xen.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Cc: xen-devel@lists.xenproject.org
Cc: haxm-team@intel.com
Cc: Paul Durrant <paul@xen.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Wenchao Wang <wenchao.wang@intel.com>
Cc: Colin Xu <colin.xu@intel.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210429112708.12291-3-david@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2021-05-13 18:21:13 +01:00
Peter Maydell
31589644ba * AccelCPUClass and sysemu/user split for i386 (Claudio)
* i386 page walk unification
 * Fix detection of gdbus-codegen
 * Misc refactoring
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmCblEEUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroObnQgAj10pRDODY9hIiUiYj2sTcEQTly3p
 DC+ZWDaup67z3WV2C/vAS/x31RGIus+7bzji3fgtUcnGOr7sbuOCcs7CPY8mam5Y
 GMPNrsUk2sZ5z9SVTq2vjEa61tjxtMpYXx9pnhgJzJAO4NJzNuX74ZdpA+oV5aTC
 CvZDk8lC7BLU16MfeLcbw44xE4Oy05wWwaoP2pvhdOg47y85t/S9Il1yBCYi3y8C
 pTOBBCYmHGPj/r7i4MhUGrAjIyGQu1w7av8nZXouRegoeVl28RKR8+pl7TfFpt+E
 cp95yE8dPuNFJnCiZ3Kv01eBcSUcyp4gVb5H2Oa/nkkYRLpnONbUzYFtIA==
 =zR5U
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini-gitlab/tags/for-upstream' into staging

* AccelCPUClass and sysemu/user split for i386 (Claudio)
* i386 page walk unification
* Fix detection of gdbus-codegen
* Misc refactoring

# gpg: Signature made Wed 12 May 2021 09:39:29 BST
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini-gitlab/tags/for-upstream: (32 commits)
  coverity-scan: list components, move model to scripts/coverity-scan
  configure: fix detection of gdbus-codegen
  qemu-option: support accept-any QemuOptsList in qemu_opts_absorb_qdict
  main-loop: remove dead code
  target/i386: use mmu_translate for NPT walk
  target/i386: allow customizing the next phase of the translation
  target/i386: extend pg_mode to more CR0 and CR4 bits
  target/i386: pass cr3 to mmu_translate
  target/i386: extract mmu_translate
  target/i386: move paging mode constants from SVM to cpu.h
  target/i386: merge SVM_NPTEXIT_* with PF_ERROR_* constants
  accel: add init_accel_cpu for adapting accel behavior to CPU type
  accel: move call to accel_init_interfaces
  i386: make cpu_load_efer sysemu-only
  target/i386: gdbstub: only write CR0/CR2/CR3/EFER for sysemu
  target/i386: gdbstub: introduce aux functions to read/write CS64 regs
  i386: split off sysemu part of cpu.c
  i386: split seg_helper into user-only and sysemu parts
  i386: split svm_helper into sysemu and stub-only user
  i386: separate fpu_helper sysemu-only parts
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-05-12 16:07:50 +01:00
Claudio Fontana
f5cc5a5c16 i386: split cpu accelerators from cpu.c, using AccelCPUClass
i386 is the first user of AccelCPUClass, allowing to split
cpu.c into:

cpu.c            cpuid and common x86 cpu functionality
host-cpu.c       host x86 cpu functions and "host" cpu type
kvm/kvm-cpu.c    KVM x86 AccelCPUClass
hvf/hvf-cpu.c    HVF x86 AccelCPUClass
tcg/tcg-cpu.c    TCG x86 AccelCPUClass

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

[claudio]:
Rebased on commit b8184135 ("target/i386: allow modifying TCG phys-addr-bits")

Signed-off-by: Claudio Fontana <cfontana@suse.de>
Message-Id: <20210322132800.7470-5-cfontana@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-05-10 15:41:49 -04:00
Anthony PERARD
f1e43b6026 xen: Free xenforeignmemory_resource at exit
Because Coverity complains about it and this is one leak that Valgrind
reports.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Paul Durrant <paul@xen.org>
Message-Id: <20210430163742.469739-1-anthony.perard@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2021-05-10 13:43:58 +01:00
Igor Druzhinin
3e81a71c9f xen-mapcache: avoid a race on memory map while using MAP_FIXED
When we're replacing the existing mapping there is possibility of a race
on memory map with other threads doing mmap operations - the address being
unmapped/re-mapped could be occupied by another thread in between.

Linux mmap man page recommends keeping the existing mappings in place to
reserve the place and instead utilize the fact that the next mmap operation
with MAP_FIXED flag passed will implicitly destroy the existing mappings
behind the chosen address. This behavior is guaranteed by POSIX / BSD and
therefore is portable.

Note that it wouldn't make the replacement atomic for parallel accesses to
the replaced region - those might still fail with SIGBUS due to
xenforeignmemory_map not being atomic. So we're still not expecting those.

Tested-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Igor Druzhinin <igor.druzhinin@citrix.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-Id: <1618889702-13104-1-git-send-email-igor.druzhinin@citrix.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
2021-05-10 13:43:58 +01:00
Igor Mammedov
bb9feea431 x86: acpi: use offset instead of pointer when using build_header()
Do the same as in commit
 (4d027afeb3 Virt: ACPI: fix qemu assert due to re-assigned table data address)
for remaining tables that happen to use saved at
the beginning pointer to build header to avoid assert
when table_data is relocated due to implicit re-size.

In this case user is trying to start Windows 10 and getting assert at
 hw/acpi/bios-linker-loader.c:239:
  bios_linker_loader_add_checksum: Assertion `start_offset < file->blob->len' failed.

Fixes: https://bugs.launchpad.net/bugs/1923497
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210414084356.3792113-1-imammedo@redhat.com>
Cc: mst@redhat.com, qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-04 07:27:37 -04:00
Jean-Philippe Brucker
5d31e1e59a amd_iommu: Fix pte_override_page_mask()
AMD IOMMU PTEs have a special mode allowing to specify an arbitrary page
size. Quoting the AMD IOMMU specification: "When the Next Level bits [of
a pte] are 7h, the size of the page is determined by the first zero bit
in the page address, starting from bit 12."

So if the lowest bits of the page address is 0, the page is 8kB. If the
lowest bits are 011, the page is 32kB. Currently pte_override_page_mask()
doesn't compute the right value for this page size and amdvi_translate()
can return the wrong guest-physical address. With a Linux guest, DMA
from SATA devices accesses the wrong memory and causes probe failure:

qemu-system-x86_64 ... -device amd-iommu -drive id=hd1,file=foo.bin,if=none \
		-device ahci,id=ahci -device ide-hd,drive=hd1,bus=ahci.0
[    6.613093] ata1.00: qc timeout (cmd 0xec)
[    6.615062] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)

Fix the page mask.

Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Message-Id: <20210421084007.1190546-1-jean-philippe@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-05-04 07:27:37 -04:00
Thomas Huth
ee86213aa3 Do not include exec/address-spaces.h if it's not really necessary
Stop including exec/address-spaces.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-5-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
2068cabd3f Do not include cpu.h if it's not really necessary
Stop including cpu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-4-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
ead62c75f6 Do not include hw/boards.h if it's not really necessary
Stop including hw/boards.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-3-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:51 +02:00
Thomas Huth
4c386f8064 Do not include sysemu/sysemu.h if it's not really necessary
Stop including sysemu/sysemu.h in files that don't need it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20210416171314.2074665-2-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Thomas Huth
e924921f5c hw: Do not include hw/irq.h if it is not necessary
Many files include hw/irq.h without needing it. Remove the superfluous
include statements.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20210327050236.2232347-1-thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2021-05-02 17:24:50 +02:00
Cornelia Huck
da7e13c00b hw: add compat machines for 6.1
Add 6.1 machine types for arm/i440fx/q35/s390x/spapr.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Message-id: 20210331111900.118274-1-cohuck@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2021-04-30 11:16:51 +01:00
Marian Postevca
d07b22863b acpi: Move setters/getters of oem fields to X86MachineState
The code that sets/gets oem fields is duplicated in both PC and MICROVM
variants. This commit moves it to X86MachineState so that all x86
variants can use it and duplication is removed.

Signed-off-by: Marian Postevca <posteuca@mutex.one>
Message-Id: <20210221001737.24499-2-posteuca@mutex.one>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
6930ba0d44 acpi: Move maximum size logic into acpi_add_rom_blob()
We want to have safety margins for all tables based on the table type.
Let's move the maximum size logic into acpi_add_rom_blob() and make it
dependent on the table name, so we don't have to replicate for each and
every instance that creates such tables.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-4-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
2a3bdc5cec microvm: Don't open-code "etc/table-loader"
Let's just reuse ACPI_BUILD_LOADER_FILE.

Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-3-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
David Hildenbrand
6c2b24d1d2 acpi: Set proper maximum size for "etc/table-loader" blob
The resizeable memory region / RAMBlock that is created for the cmd blob
has a maximum size of whole host pages (e.g., 4k), because RAMBlocks
work on full host pages. In addition, in i386 ACPI code:
  acpi_align_size(tables->linker->cmd_blob, ACPI_BUILD_ALIGN_SIZE);
makes sure to align to multiples of 4k, padding with 0.

For example, if our cmd_blob is created with a size of 2k, the maximum
size is 4k - we cannot grow beyond that. Growing might be required
due to guest action when rebuilding the tables, but also on incoming
migration.

This automatic generation of the maximum size used to be sufficient,
however, there are cases where we cross host pages now when growing at
runtime: we exceed the maximum size of the RAMBlock and can crash QEMU when
trying to resize the resizeable memory region / RAMBlock:
  $ build/qemu-system-x86_64 --enable-kvm \
      -machine q35,nvdimm=on \
      -smp 1 \
      -cpu host \
      -m size=2G,slots=8,maxmem=4G \
      -object memory-backend-file,id=mem0,mem-path=/tmp/nvdimm,size=256M \
      -device nvdimm,label-size=131072,memdev=mem0,id=nvdimm0,slot=1 \
      -nodefaults \
      -device vmgenid \
      -device intel-iommu

Results in:
  Unexpected error in qemu_ram_resize() at ../softmmu/physmem.c:1850:
  qemu-system-x86_64: Size too large: /rom@etc/table-loader:
    0x2000 > 0x1000: Invalid argument

In this configuration, we consume exactly 4k (32 entries, 128 bytes each)
when creating the VM. However, once the guest boots up and maps the MCFG,
we also create the MCFG table and end up consuming 2 additional entries
(pointer + checksum) -- which is where we try resizing the memory region
/ RAMBlock, however, the maximum size does not allow for it.

Currently, we get the following maximum sizes for our different
mutable tables based on behavior of resizeable RAMBlock:

  hw       table                max_size
  -------  ---------------------------------------------------------

  virt     "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  virt     "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  virt     "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

  i386     "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  i386     "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  i386     "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

  microvm  "etc/acpi/tables"    ACPI_BUILD_TABLE_MAX_SIZE (0x200000)
  microvm  "etc/table-loader"   HOST_PAGE_ALIGN(initial_size)
  microvm  "etc/acpi/rsdp"      HOST_PAGE_ALIGN(initial_size)

Let's set the maximum table size for "etc/table-loader" to 64k, so we
can properly grow at runtime, which should be good enough for the future.

Migration is not concerned with the maximum size of a RAMBlock, only
with the used size - so existing setups are not affected. Of course, we
cannot migrate a VM that would have crash when started on older QEMU from
new QEMU to older QEMU without failing early on the destination when
synchronizing the RAM state:
    qemu-system-x86_64: Size too large: /rom@etc/table-loader: 0x2000 > 0x1000: Invalid argument
    qemu-system-x86_64: error while loading state for instance 0x0 of device 'ram'
    qemu-system-x86_64: load of migration failed: Invalid argument

We'll refactor the code next, to make sure we get rid of this implicit
behavior for "etc/acpi/rsdp" as well and to make the code easier to
grasp.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <shannon.zhaosl@gmail.com>
Cc: Marcel Apfelbaum <marcel.apfelbaum@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20210304105554.121674-2-david@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
b7f23f62e4 pci: acpi: add _DSM method to PCI devices
Implement _DSM according to:
    PCI Firmware Specification 3.1
    4.6.7.  DSM for Naming a PCI or PCI Express Device Under
            Operating Systems
and wire it up to cold and hot-plugged PCI devices.
Feature depends on ACPI hotplug being enabled (as that provides
PCI devices descriptions in ACPI and MMIO registers that are
reused to fetch acpi-index).

acpi-index should work for
  - cold plugged NICs:
      $QEMU -device e1000,acpi-index=100
         => 'eno100'
  - hot-plugged
      (monitor) device_add e1000,acpi-index=200,id=remove_me
         => 'eno200'
  - re-plugged
      (monitor) device_del remove_me
      (monitor) device_add e1000,acpi-index=1
         => 'eno1'

Windows also sees index under "PCI Label Id" field in properties
dialog but otherwise it doesn't seem to have any effect.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-6-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00
Igor Mammedov
b32bd763a1 pci: introduce acpi-index property for PCI device
In x86/ACPI world, linux distros are using predictable
network interface naming since systemd v197. Which on
QEMU based VMs results into path based naming scheme,
that names network interfaces based on PCI topology.

With itm on has to plug NIC in exactly the same bus/slot,
which was used when disk image was first provisioned/configured
or one risks to loose network configuration due to NIC being
renamed to actually used topology.
That also restricts freedom to reshape PCI configuration of
VM without need to reconfigure used guest image.

systemd also offers "onboard" naming scheme which is
preferred over PCI slot/topology one, provided that
firmware implements:
    "
    PCI Firmware Specification 3.1
    4.6.7.  DSM for Naming a PCI or PCI Express Device Under
            Operating Systems
    "
that allows to assign user defined index to PCI device,
which systemd will use to name NIC. For example, using
  -device e1000,acpi-index=100
guest will rename NIC to 'eno100', where 'eno' is default
prefix for "onboard" naming scheme. This doesn't require
any advance configuration on guest side to com in effect
at 'onboard' scheme takes priority over path based naming.

Hope is that 'acpi-index' it will be easier to consume by
management layer, compared to forcing specific PCI topology
and/or having several disk image templates for different
topologies and will help to simplify process of spawning
VM from the same template without need to reconfigure
guest NIC.

This patch adds, 'acpi-index'* property and wires up
a 32bit register on top of pci hotplug register block
to pass index value to AML code at runtime.
Following patch will add corresponding _DSM code and
wire it up to PCI devices described in ACPI.

*) name comes from linux kernel terminology

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20210315180102.3008391-3-imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2021-03-22 18:58:19 -04:00