Commit Graph

570 Commits

Author SHA1 Message Date
Cédric Le Goater 0976efd51b spapr_pci: add an extra 'nr_msis' argument to spapr_populate_pci_dt
So that we don't have to call qdev_get_machine() to get the machine
class and the sPAPRIrq backend holding the number of MSIs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-09-25 11:12:25 +10:00
Cédric Le Goater ae83740237 spapr: increase the size of the IRQ number space
The new layout using static IRQ number does not leave much space to
the dynamic MSI range, only 0x100 IRQ numbers. Increase the total
number of IRQS for newer machines and introduce a legacy XICS backend
for pre-3.1 machines to maintain compatibility.

For the old backend, provide a 'nr_msis' value covering the full IRQ
number space as it does not use the bitmap allocator to allocate MSI
interrupt numbers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-09-25 11:12:25 +10:00
Sam Bobroff ecda255eba spapr: Correct reference count on spapr-cpu-core
spapr_init_cpus() currently creates spapr-cpu-core objects via
object_new() and setting their realized property to true. This leaves
their reference count at two, because object_new() adds an initial
reference and the realization attaches them to a default parent object
which also increments the reference count.

This causes a problem if one of these cores is hot unplugged: no
delete event is generated for it because it's reference count doesn't
reach zero when it is detached from it's parent.

Correct this by adding a call to object_unref() in spapr_init_cpus().

Signed-off-by: Sam Bobroff <sbobroff@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-30 15:58:42 +10:00
Emilio G. Cota eceba3477e spapr: fix leak of rev array
Introduced in 04d595b300 ("spapr: do not use CPU_FOREACH_REVERSE",
2018-08-23)

Fixes: CID1395181
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-28 11:31:23 +10:00
Peter Maydell 3c825bb7c1 * x86 TCG fixes for 64-bit call gates (Andrew)
* qumu-guest-agent freeze-hook tweak (Christian)
 * pm_smbus improvements (Corey)
 * Move validation to pre_plug for pc-dimm (David)
 * Fix memory leaks (Eduardo, Marc-André)
 * synchronization profiler (Emilio)
 * Convert the CPU list to RCU (Emilio)
 * LSI support for PPR Extended Message (George)
 * vhost-scsi support for protection information (Greg)
 * Mark mptsas as a storage device in the help (Guenter)
 * checkpatch tweak cherry-picked from Linux (me)
 * Typos, cleanups and dead-code removal (Julia, Marc-André)
 * qemu-pr-helper support for old libmultipath (Murilo)
 * Annotate fallthroughs (me)
 * MemoryRegionOps cleanup (me, Peter)
 * Make s390 qtests independent from libqos, which doesn't actually support it (me)
 * Make cpu_get_ticks independent from BQL (me)
 * Introspection fixes (Thomas)
 * Support QEMU_MODULE_DIR environment variable (ryang)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlt+5OYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPtxwf8CQM/F+0L+EKeYfYcVgVZsDhhOkLj
 Pm61q0bZsWKLby5jCqIDYw7Z/vodJnSS1DO0slIRoXxvQ9DwlkbBnBy/aG/E9U0q
 WF1vbCezibDIt7sGcsu9F5zXU9eqe+E6dZfxFrv8FQSOFVxn34TfeJagWLCtzg0d
 LnVTF/e4zJD8IQiM7w6lJQxua3fz13ssPEg2KnMkguDhACMwvZ/K/cA2AJkHRMhY
 sroPMwLHlrF1NOoeCIrWxYUmSGCRCAy1DmiPGiiSs0yBq/dL0UkAa5Eu6HMQ7rgI
 zUff3JDmzEjixUSIEbpVRN+yPCN0/ACSOpJUrKLDxXbc4nZ+PBQ04YpyPQ==
 =UZiV
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* x86 TCG fixes for 64-bit call gates (Andrew)
* qumu-guest-agent freeze-hook tweak (Christian)
* pm_smbus improvements (Corey)
* Move validation to pre_plug for pc-dimm (David)
* Fix memory leaks (Eduardo, Marc-André)
* synchronization profiler (Emilio)
* Convert the CPU list to RCU (Emilio)
* LSI support for PPR Extended Message (George)
* vhost-scsi support for protection information (Greg)
* Mark mptsas as a storage device in the help (Guenter)
* checkpatch tweak cherry-picked from Linux (me)
* Typos, cleanups and dead-code removal (Julia, Marc-André)
* qemu-pr-helper support for old libmultipath (Murilo)
* Annotate fallthroughs (me)
* MemoryRegionOps cleanup (me, Peter)
* Make s390 qtests independent from libqos, which doesn't actually support it (me)
* Make cpu_get_ticks independent from BQL (me)
* Introspection fixes (Thomas)
* Support QEMU_MODULE_DIR environment variable (ryang)

# gpg: Signature made Thu 23 Aug 2018 17:46:30 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (69 commits)
  KVM: cleanup unnecessary #ifdef KVM_CAP_...
  target/i386: update MPX flags when CPL changes
  i2c: pm_smbus: Add the ability to force block transfer enable
  i2c: pm_smbus: Don't delay host status register busy bit when interrupts are enabled
  i2c: pm_smbus: Add interrupt handling
  i2c: pm_smbus: Add block transfer capability
  i2c: pm_smbus: Make the I2C block read command read-only
  i2c: pm_smbus: Fix the semantics of block I2C transfers
  i2c: pm_smbus: Clean up some style issues
  pc-dimm: assign and verify the "addr" property during pre_plug
  pc: drop memory region alignment check for 0
  util/oslib-win32: indicate alignment for qemu_anon_ram_alloc()
  pc-dimm: assign and verify the "slot" property during pre_plug
  ipmi: Use proper struct reference for BT vmstate
  vhost-scsi: expose 't10_pi' property for VIRTIO_SCSI_F_T10_PI
  vhost-scsi: unify vhost-scsi get_features implementations
  vhost-user-scsi: move host_features into VHostSCSICommon
  cpus: allow cpu_get_ticks out of BQL
  cpus: protect TimerState writes with a spinlock
  seqlock: add QemuLockable support
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-08-23 19:03:54 +01:00
David Hildenbrand b0e624435b pc-dimm: assign and verify the "addr" property during pre_plug
We can assign and verify the address before realizing and trying to plug.
reading/writing the address property should never fail for DIMMs, so let's
reduce error handling a bit by using &error_abort. Getting access to the
memory region now might however fail. So forward errors from
get_memory_region() properly.

As all memory devices should use the alignment of the underlying memory
region for guest physical address asignment, do detection of the
alignment in pc_dimm_pre_plug(), but allow pc.c to overwrite the
alignment for compatibility handling.

Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180801133444.11269-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23 18:46:25 +02:00
David Hildenbrand 8f1ffe5be8 pc-dimm: assign and verify the "slot" property during pre_plug
We can assign and verify the slot before realizing and trying to plug.
reading/writing the slot property should never fail, so let's reduce
error handling a bit by using &error_abort.

To do this during pre_plug, add and use (x86, ppc) pc_dimm_pre_plug().

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180801133444.11269-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23 18:46:25 +02:00
Emilio G. Cota 04d595b300 spapr: do not use CPU_FOREACH_REVERSE
This paves the way for implementing the CPU list with an RCU list,
which cannot be traversed in reverse order.

Note that this is the only caller of CPU_FOREACH_REVERSE.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180819091335.22863-11-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-08-23 18:46:25 +02:00
Cédric Le Goater ef01ed9d19 spapr: introduce a IRQ controller backend to the machine
This proposal moves all the related IRQ routines of the sPAPR machine
behind a sPAPR IRQ backend interface 'spapr_irq' to prepare for future
changes. First of which will be to increase the size of the IRQ number
space, then, will follow a new backend for the POWER9 XIVE IRQ controller.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Cédric Le Goater 82cffa2eb2 spapr: introduce a fixed IRQ number space
This proposal introduces a new IRQ number space layout using static
numbers for all devices, depending on a device index, and a bitmap
allocator for the MSI IRQ numbers which are negotiated by the guest at
runtime.

As the VIO device model does not have a device index but a "reg"
property, we introduce a formula to compute an IRQ number from a "reg"
value. It should minimize most of the collisions.

The previous layout is kept in pre-3.1 machines raising the
'legacy_irq_allocation' machine class flag.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Cédric Le Goater d45360d93d spapr: Add a pseries-3.1 machine type
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-08-21 14:28:45 +10:00
Mark Cave-Ayland 907aac2f6a fw_cfg: ignore suffixes in the bootdevice list dependent on machine class
For the older machines (such as Mac and SPARC) the DT nodes representing
bootdevices for disk nodes are irregular for mainly historical reasons.

Since the majority of bootdevice nodes for these machines either do not have a
separate disk node or require different (custom) names then it is much easier
for processing to just disable all suffixes for a particular machine.

Introduce a new ignore_boot_device_suffixes MachineClass property to control
bootdevice suffix generation, defaulting to false in order to preserve
compatibility.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20180810124027.10698-1-mark.cave-ayland@ilande.co.uk>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-08-16 22:27:43 -03:00
David Gibson ccc2cef8b3 spapr: Correct inverted test in spapr_pc_dimm_node()
This function was introduced between v2.11 and v2.12 to replace obsolete
ways of specifying the NUMA nodes for DIMMs.  It's used to find the correct
node for an LMB, by locating which DIMM object it lies within.

Unfortunately, one of the checks is inverted, so we check whether the
address is less than two different things, rather than actually checking
a range.  This introduced a regression, meaning that after a reboot qemu
will advertise incorrect node information for memory to the guest.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2018-07-16 11:18:09 +10:00
Peter Maydell b07cd3e748 ppc patch queue 2018-07-03
Here's a last minue pull request before today's soft freeze.  Ideally
 I would have sent this earlier, but I was waiting for a couple of
 extra fixes I knew were close.  And the freeze crept up on me, like
 always.
 
 Most of the changes here are bugfixes in any case.  There are some
 cleanups as well, which have been in my staging tree for a little
 while.  There are a couple of truly new features (some extensions to
 the sam460ex platform), but these are low risk, since they only affect
 a new and not really stabilized machine type anyway.
 
 Higlights are:
   * Mac platform improvements from Mark Cave-Ayland
   * Sam460ex improvements from BALATON Zoltan et al.
   * XICS interrupt handler cleanups from Cédric Le Goater
   * TCG improvements for atomic loads and stores from Richard
     Henderson
   * Assorted other bugfixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAls7D8oACgkQbDjKyiDZ
 s5Lxmg//YzPfC/nKqTTKkyJPzh/NnSC+kRTMAT3mbxdRIc7yfgMqJtWGGbS1iKgK
 EeJ9hl5Qm0HfscfDuzf0xasU62ZEv3kNdLnWJEIgkqiXrxoO5KCnC0y4D8NN1W03
 mvINNCa8+QDg2OsirGmNUTkriiG3wLIrHTpLZ4+JuC2Bd9H3nTHZgJ0MXON/1VWY
 oRgr6kMZ5+IAzPhvYLFR6l3nPI883fgJOFyRo7YqYrkVBKFrFkfK0Xjw6vpsNxcx
 2dE/YCHhNIriLuBG5noewL7GuqZRtLnl6rjjee5VAKIe1EmFeR+jsXwNjzGOVOJg
 dhjOtsJsQQ3WdEw5uImJzE64kV228WCgmkeXzZd1010JBLr7sUkrd2EuoZ23vvat
 uvZAHVSBrJg5WvzMo1VMEoPU3VeeZQ5HL+MI80iKiU6oUgRK11gVJcebtA0sEKt+
 zhJC4JiUlHtZLTGIpMBmU8DJZ3Tyk1cBEm+Ky+SaPE+dsz16UHI0fazFQXJnXphE
 MLHEGAyQgzWYp7kIcAjUFev0Geq/Uovy4JKIGI6ISop1wRPEQDxkthfkfRyQxQkE
 zuse4EBcEH/Undw9KrmEQa0hCe+8BRkxklVbPesFPPdqH3PKNxtHYuWpSShQF0PW
 XMjw43O2Rbsl8kBUHCpy4pYSugD1hpfgaw/mVUOU1u/M1O6toTw=
 =AHrx
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-3.0-20180703' into staging

ppc patch queue 2018-07-03

Here's a last minue pull request before today's soft freeze.  Ideally
I would have sent this earlier, but I was waiting for a couple of
extra fixes I knew were close.  And the freeze crept up on me, like
always.

Most of the changes here are bugfixes in any case.  There are some
cleanups as well, which have been in my staging tree for a little
while.  There are a couple of truly new features (some extensions to
the sam460ex platform), but these are low risk, since they only affect
a new and not really stabilized machine type anyway.

Higlights are:
  * Mac platform improvements from Mark Cave-Ayland
  * Sam460ex improvements from BALATON Zoltan et al.
  * XICS interrupt handler cleanups from Cédric Le Goater
  * TCG improvements for atomic loads and stores from Richard
    Henderson
  * Assorted other bugfixes

# gpg: Signature made Tue 03 Jul 2018 06:55:22 BST
# gpg:                using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-3.0-20180703: (35 commits)
  ppc: Include vga cirrus card into the compiling process
  target/ppc: Relax reserved bitmask of indexed store instructions
  target/ppc: set is_jmp on ppc_tr_breakpoint_check
  spapr: compute default value of "hpt-max-page-size" later
  target/ppc/kvm: don't pass cpu to kvm_get_smmu_info()
  target/ppc/kvm: get rid of kvm_get_fallback_smmu_info()
  ppc440_uc: Basic emulation of PPC440 DMA controller
  sam460ex: Add RTC device
  hw/timer: Add basic M41T80 emulation
  ppc4xx_i2c: Rewrite to model hardware more closely
  hw/ppc: Give sam46ex its own config option
  fpu_helper.c: fix setting FPSCR[FI] bit
  target/ppc: Implement the rest of gen_st_atomic
  target/ppc: Implement the rest of gen_ld_atomic
  target/ppc: Use atomic min/max helpers
  target/ppc: Use MO_ALIGN for EXIWX and ECOWX
  target/ppc: Split out gen_st_atomic
  target/ppc: Split out gen_ld_atomic
  target/ppc: Split out gen_load_locked
  target/ppc: Tidy gen_conditional_store
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

# Conflicts:
#	hw/ppc/spapr.c
2018-07-03 14:59:27 +01:00
Sebastian Bauer 29f9cef39e ppc: Include vga cirrus card into the compiling process
Drivers for this card exists on PPC-based AmigaOS guests so it is useful to
allow users to emulate the graphics card for PPC machines.

As cirrus vga is currently preferred over std(vga) in absence of any user
choice, this change also sets the default display of spapr machines to
std as otherwise qemu refuses to start these machines. Not specifying an
explicit graphics mode is for instance done by 'make check'.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03 11:23:09 +10:00
Greg Kurz e89372951d spapr: compute default value of "hpt-max-page-size" later
It is currently not possible to run a pseries-2.12 or older machine
with HV KVM. QEMU prints the following and exits right away.

qemu-system-ppc64: KVM doesn't support for base page shift 34

The "hpt-max-page-size" capability was recently added to spapr to hide
host configuration details from HPT mode guests. Its default value for
newer machine types is 64k.

For backwards compatibility, pseries-2.12 and older machine types need
a different value. This is handled as usual in a class init function.
The default value is 16G, ie, all page sizes supported by POWER7 and
newer CPUs, but HV KVM requires guest pages to be hpa contiguous as
well as gpa contiguous. The default value is the page size used to
back the guest RAM in this case.

Unfortunately kvmppc_hpt_needs_host_contiguous_pages()->kvm_enabled() is
called way before KVM init and returns false, even if the user requested
KVM. We thus end up selecting 16G, which isn't supported by HV KVM. The
default value must be set during machine init, because we can safely
assume that KVM is initialized at this point.

We fix this by moving the logic to default_caps_with_cpu(). Since the
user cannot pass cap-hpt-max-page-size=0, we set the default to 0 in
the pseries-2.12 class init function and use that as a flag to do the
real work.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03 10:20:15 +10:00
Cédric Le Goater abe82ebb20 ppc/xics: rework the ICS classes inheritance tree
With the previous changes, we can now let the ICS_KVM class inherit
directly from ICS_BASE class and not from the intermediate ICS_SIMPLE.
It makes the class hierarchy much cleaner.

What is left in the top classes is the low level interface to access
the KVM XICS device in ICS_KVM and the XICS emulating handlers in
ICS_SIMPLE.

This should not break migration compatibility.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-07-03 09:56:51 +10:00
Philippe Mathieu-Daudé ab3dd74924 hw/ppc: Use the IEC binary prefix definitions
It eases code review, unit is explicit.

Patch generated using:

  $ git grep -E '(1024|2048|4096|8192|(<<|>>).?(10|20|30))' hw/ include/hw/

and modified manually.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20180625124238.25339-33-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-02 15:41:16 +02:00
Philippe Mathieu-Daudé d23b6caadb hw: Use IEC binary prefix definitions from "qemu/units.h"
Code change produced with:

  $ git ls-files | egrep '\.[ch]$' | \
    xargs sed -i -e 's/\(\W[KMGTPE]\)_BYTE/\1iB/g'

Suggested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Message-Id: <20180625124238.25339-6-f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-07-02 15:41:10 +02:00
David Hildenbrand f0b7bca64d pc-dimm: get_memory_region() will not fail after realize
Let's try to reduce error handling a bit. In the plug/unplug case, the
device was realized and therefore we can assume that getting access to
the memory region will not fail.

For get_vmstate_memory_region() this is already handled that way.
Document both cases.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-13-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:34 +02:00
David Hildenbrand 284878ee98 pc-dimm: rename pc_dimm_memory_* to pc_dimm_*
Let's rename it to make it look more consistent.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180619134141.29478-4-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-28 19:05:33 +02:00
David Gibson 123eec6552 spapr: Use maximum page size capability to simplify memory backend checking
The way we used to handle KVM allowable guest pagesizes for PAPR guests
required some convoluted checking of memory attached to the guest.

The allowable pagesizes advertised to the guest cpus depended on the memory
which was attached at boot, but then we needed to ensure that any memory
later hotplugged didn't change which pagesizes were allowed.

Now that we have an explicit machine option to control the allowable
maximum pagesize we can simplify this.  We just check all memory backends
against that declared pagesize.  We check base and cold-plugged memory at
reset time, and hotplugged memory at pre_plug() time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-06-22 14:19:07 +10:00
David Gibson 2309832afd spapr: Maximum (HPT) pagesize property
The way the POWER Hash Page Table (HPT) MMU is virtualized by KVM HV means
that every page that the guest puts in the pagetables must be truly
physically contiguous, not just GPA-contiguous.  In effect this means that
an HPT guest can't use any pagesizes greater than the host page size used
to back its memory.

At present we handle this by changing what we advertise to the guest based
on the backing pagesizes.  This is pretty bad, because it means the guest
sees a different environment depending on what should be host configuration
details.

As a start on fixing this, we add a new capability parameter to the
pseries machine type which gives the maximum allowed pagesizes for an
HPT guest.  For now we just create and validate the parameter without
making it do anything.

For backwards compatibility, on older machine types we set it to the max
available page size for the host.  For the 3.0 machine type, we fix it to
16, the intention being to only allow HPT pagesizes up to 64kiB by default
in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-06-22 14:19:07 +10:00
Cédric Le Goater 71b5c8d26e spapr: remove unused spapr_irq routines
spapr_irq_alloc_block and spapr_irq_alloc() are now deprecated.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
Cédric Le Goater 4fe75a8ccd spapr: split the IRQ allocation sequence
Today, when a device requests for IRQ number in a sPAPR machine, the
spapr_irq_alloc() routine first scans the ICSState status array to
find an empty slot and then performs the assignement of the selected
numbers. Split this sequence in two distinct routines : spapr_irq_find()
for lookups and spapr_irq_claim() for claiming the IRQ numbers.

This will ease the introduction of a static layout of IRQ numbers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
David Gibson 9f6edd066e spapr: Compute effective capability values earlier
Previously, the effective values of the various spapr capability flags
were only determined at machine reset time.  That was a lazy way of making
sure it was after cpu initialization so it could use the cpu object to
inform the defaults.

But we've now improved the compat checking code so that we don't need to
instantiate the cpus to use it.  That lets us move the resolution of the
capability defaults much earlier.

This is going to be necessary for some future capabilities.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-06-21 21:22:53 +10:00
David Gibson ad99d04c76 target/ppc: Allow cpu compatiblity checks based on type, not instance
ppc_check_compat() is used in a number of places to check if a cpu object
supports a certain compatiblity mode, subject to various constraints.

It takes a PowerPCCPU *, however it really only depends on the cpu's class.
We have upcoming cases where it would be useful to make compatibility
checks before we fully instantiate the cpu objects.

ppc_type_check_compat() will now make an equivalent check, but based on a
CPU's QOM typename instead of an instantiated CPU object.

We make use of the new interface in several places in spapr, where we're
essentially making a global check, rather than one specific to a particular
cpu.  This avoids some ugly uses of first_cpu to grab a "representative"
instance.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2018-06-21 21:22:53 +10:00
Greg Kurz b94020268e spapr_cpu_core: migrate per-CPU data
A per-CPU machine data pointer was recently added to PowerPCCPU. The
motivation is to to hide platform specific details from the core CPU
code. This per-CPU data can hold state which is relevant to the guest
though, eg, Virtual Processor Areas, and we should migrate this state.

This patch adds the plumbing so that we can migrate the per-CPU data
for PAPR guests. We only do this for newer machine types for the sake
of backward compatibility. No state is migrated for the moment: the
vmstate_spapr_cpu_state structure will be populated by subsequent
patches.

Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix some trivial spelling and spacing errors]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-21 21:22:53 +10:00
Greg Kurz 844afc54ae spapr: fix xics_system_init() error path
Commit 3d85885a1b tried to fix error handling, but it actually
went into the wrong direction by dropping the local Error *.

In the default KVM case, the rationale is to try the in-kernel XICS first,
and if not possible, to fallback to userland XICS. Passing errp everywhere
makes this fallback impossible if errp is &error_fatal (which happens to
be the case). And anyway, if the caller would pass a regular &local_err,
things would be worse: we could possibly pass an already set *errp to
error_setg() and crash, or return an error even in case of success.

So we definitely need a local Error * and only propagate it when we're
done with the fallback logic. This is what this patch does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-18 09:43:19 +10:00
David Hildenbrand a4261be172 spapr: handle cpu core unplug via hotplug handler chain
Factor out cpu core unplug into separate function from
spapr_core_release(). Then use generic hotplug_handler_unplug() to trigger
cpu core unplug, which would call spapr_machine_device_unplug() ->
spapr_core_unplug() in the end.

This way unplug operation is not buried in spapr internals and located
in the same place like in other targets, following similar
logic/call chain across targets.

Acked-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 3ec71474ca spapr: handle pc-dimm unplug via hotplug handler chain
Factor out memory unplug into separate function from spapr_lmb_release().
Then use generic hotplug_handler_unplug() to trigger memory unplug,
which will call spapr_machine_device_unplug() -> spapr_memory_unplug()
in the end.

This way unplug operation is not buried in lmb internals and located in
the same place like in other targets, following similar logic/call chain
across targets.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 88432f44aa spapr: introduce machine unplug handler
We'll be handling unplug of e.g. CPUs and PCDIMMs  via the general
hotplug handler soon, so let's add that handler function.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 4e8a01bdb2 spapr: move memory hotplug support check into spapr_memory_pre_plug()
Let's finish cleaning up the hotplug handler. This check can be
performed in the pre_plug code as the very first thing.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand 81985f3be9 spapr: move lookup of the node into spapr_memory_plug()
Let's clean the hotplug handler up by moving lookup of the node into
the function where it is actually being used.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
David Hildenbrand fcc8ef17e2 spapr: no need to verify the node
The node property can always be queried and the value has already been
verified in pc_dimm_realize().

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-06-12 10:44:36 +10:00
Peter Maydell afd76ffba9 * Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
 * IPMI migration fix (Corey)
 * QOM improvements (Alexey, Philippe, me)
 * Memory API cleanups (Jay, me, Tristan, Peter)
 * WHPX fixes and improvements (Lucian)
 * Chardev fixes (Marc-André)
 * IOMMU documentation improvements (Peter)
 * Coverity fixes (Peter, Philippe)
 * Include cleanup (Philippe)
 * -clock deprecation (Thomas)
 * Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
 * Configurability improvements (me)
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAlsRd2UUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPG8Qf+M85E8xAQ/bhs90tAymuXkUUsTIFF
 uI76K8eM0K3b2B+vGckxh1gyN5O3GQaMEDL7vITfqbX+EOH5U2lv8V9JRzf2YvbG
 Zahjd4pOCYzR0b9JENA1r5U/J8RntNrBNXlKmGTaXOaw9VCXlZyvgVd9CE3z/e2M
 0jSXMBdF4LB3UzECI24Va8ejJxdSiJcqXA2j3J+pJFxI698i+Z5eBBKnRdo5TVe5
 jl0TYEsbS6CLwhmbLXmt3Qhq+ocZn7YH9X3HjkHEdqDUeYWyT9jwUpa7OHFrIEKC
 ikWm9er4YDzG/vOC0dqwKbShFzuTpTJuMz5Mj4v8JjM/iQQFrp4afjcW2g==
 =RS/B
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Linux header upgrade (Peter)
* firmware.json definition (Laszlo)
* IPMI migration fix (Corey)
* QOM improvements (Alexey, Philippe, me)
* Memory API cleanups (Jay, me, Tristan, Peter)
* WHPX fixes and improvements (Lucian)
* Chardev fixes (Marc-André)
* IOMMU documentation improvements (Peter)
* Coverity fixes (Peter, Philippe)
* Include cleanup (Philippe)
* -clock deprecation (Thomas)
* Disable -sandbox unless CONFIG_SECCOMP (Yi Min Zhao)
* Configurability improvements (me)

# gpg: Signature made Fri 01 Jun 2018 17:42:13 BST
# gpg:                using RSA key BFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (56 commits)
  hw: make virtio devices configurable via default-configs/
  hw: allow compiling out SCSI
  memory: Make operations using MemoryRegionIoeventfd struct pass by pointer.
  char: Remove unwanted crlf conversion
  qdev: Remove DeviceClass::init() and ::exit()
  qdev: Simplify the SysBusDeviceClass::init path
  hw/i2c: Use DeviceClass::realize instead of I2CSlaveClass::init
  hw/i2c/smbus: Use DeviceClass::realize instead of SMBusDeviceClass::init
  target/i386/kvm.c: Remove compatibility shim for KVM_HINTS_REALTIME
  Update Linux headers to 4.17-rc6
  target/i386/kvm.c: Handle renaming of KVM_HINTS_DEDICATED
  scripts/update-linux-headers: Handle kernel license no longer being one file
  scripts/update-linux-headers: Handle __aligned_u64
  virtio-gpu-3d: Define VIRTIO_GPU_CAPSET_VIRGL2 elsewhere
  gdbstub: Prevent fd leakage
  docs/interop: add "firmware.json"
  ipmi: Use proper struct reference for KCS vmstate
  vmstate: Add a VSTRUCT type
  tcg: remove softfloat from --disable-tcg builds
  qemu-options: Mark the non-functional -clock option as deprecated
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-06-01 18:24:16 +01:00
Philippe Mathieu-Daudé 0304f9ec9c hw: Do not include "sysemu/block-backend.h" if it is not necessary
Remove those unneeded includes to speed up the compilation
process a little bit. (Continue 7eceff5b5a cleanup)

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180528232719.4721-13-f4bug@amsat.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 14:15:10 +02:00
Peter Maydell d8c0c7af80 ppc: Rename 2.13 machines to 3.0
Rename the 2.13 machines to match the number we're going to
use for the next release.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-id: 20180522104000.9044-5-peter.maydell@linaro.org
2018-05-29 11:28:46 +01:00
Igor Mammedov debbdc0018 make sure that we aren't overwriting mc->get_hotplug_handler by accident
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1525691524-32265-5-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-05-10 18:10:56 +01:00
David Hildenbrand 0c9269a52d spapr: rename "hotplug memory" terminology to "device memory"
Let's make it clear at relevant places that we are dealing with device
memory. That it can be used for memory hotplug is just a special case.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-11-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand bd6c3e4a49 pc-dimm: pass in the machine and to the MemoryHotplugState
We use the machine internally either way, so let's just pass it in then.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand acc7fa17e6 pc-dimm: no need to pass the memory region
We can just query it ourselves. When unplugging, we should always be
able to the region (as it was previously plugged). E.g. PPC already
assumed that and used &error_abort.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand b0c14ec4ef machine: make MemoryHotplugState accessible via the machine
Let's allow to query the MemoryHotplugState directly from the machine.
If the pointer is NULL, the machine does not support memory devices. If
the pointer is !NULL, the machine supports memory devices and the
data structure contains information about the applicable physical
guest address space region.

This allows us to generically detect if a certain machine has support
for memory devices, and to generically manage it (find free address
range, plug/unplug a memory region).

We will rename "MemoryHotplugState" to something more meaningful
("DeviceMemory") after we completed factoring out the pc-dimm code into
MemoryDevice code.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
[ehabkost: rebased series, solved conflicts at spapr.c]
[ehabkost: squashed fix to use g_malloc0()]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
David Hildenbrand 2cc0e2e814 pc-dimm: factor out MemoryDevice interface
On the qmp level, we already have the concept of memory devices:
    "query-memory-devices"
Right now, we only support NVDIMM and PCDIMM.

We want to map other devices later into the address space of the guest.
Such device could e.g. be virtio devices. These devices will have a
guest memory range assigned but won't be exposed via e.g. ACPI. We want
to make them look like memory device, but not glued to pc-dimm.

Especially, it will not always be possible to have TYPE_PC_DIMM as a parent
class (e.g. virtio devices). Let's use an interface instead. As a first
part, convert handling of
- qmp_pc_dimm_device_list
- get_plugged_memory_size
to our new model. plug/unplug stuff etc. will follow later.

A memory device will have to provide the following functions:
- get_addr(): Necessary, as the property "addr" can e.g. not be used for
              virtio devices (already defined).
- get_plugged_size(): The amount this device offers to the guest as of
                      now.
- get_region_size(): Because this can later on be bigger than the
                     plugged size.
- fill_device_info(): Fill MemoryDeviceInfo, e.g. for qmp.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180423165126.15441-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2018-05-07 10:00:02 -03:00
Greg Kurz 0550b1206a spapr: don't advertise radix GTSE if max-compat-cpu < power9
On a POWER9 host, if a guest runs in pre POWER9 compat mode, it necessarily
uses the hash MMU mode. In this case, we shouldn't advertise radix GTSE in
the ibm,arch-vec-5-platform-support DT property as the current code does.
The first reason is that it doesn't make sense, and the second one is that
causes the CAS-negotiated options subsection to be migrated. This breaks
backward migration to QEMU 2.7 and older versions on POWER8 hosts:

qemu-system-ppc64: error while loading state for instance 0x0 of device
 'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory

This patch hence initialize CPUs a bit earlier so that we can check the
requested compat mode, and don't set OV5_MMU_RADIX_GTSE for power8 and
older.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04 15:00:37 +10:00
Greg Kurz aef19c04bf spapr: don't migrate "spapr_option_vector_ov5_cas" to pre 2.8 machines
a324d6f166 "spapr: Support ibm,dynamic-memory-v2 property" added
a new feature in the set of CAS-negotiatable options. This causes
the CAS-negotiated options subsection to be migrated, even for old
machine types that don't know about it, and breaks backward migration
to QEMU 2.7 and older versions:

qemu-system-ppc64: error while loading state for instance 0x0 of device
 'spapr'
qemu-system-ppc64: load of migration failed: No such file or directory

Since this feature only affects boot time behaviour, it should be
filtered out when we decide to migrate CAS-negotiated options, like
we already do with OV5_FORM1_AFFINITY and OV5_DRCONF_MEMORY.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-05-04 15:00:37 +10:00
David Gibson 84369f639e spapr: Make a helper to set up cpu entry point state
Under PAPR, only the boot CPU is active when the system starts.  Other cpus
must be explicitly activated using an RTAS call.  The entry state for the
boot and secondary cpus isn't identical, but it has some things in common.
We're going to add a bit more common setup later, too, so to simplify
make a helper which sets up the common entry state for both boot and
secondary cpu threads.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2018-05-04 15:00:37 +10:00
David Gibson 090052aa08 spapr: Remove support for explicitly allocated RMAs
Current POWER cpus allow for a VRMA, a special mapping which describes a
guest's view of memory when in real mode (MMU off, from the guest's point
of view).  Older cpus didn't have that which meant that to support a guest
a special host-contiguous region of memory was needed to give the guest its
Real Mode Area (RMA).

KVM used to provide special calls to allocate a contiguous RMA for those
cases.  This was useful in the early days of KVM on Power to allow it to be
tested on PowerPC 970 chips as used in Macintosh G5 machines.  Now, those
machines are so old as to be almost irrelevant.

The normal qemu deprecation process would require this to be marked
deprecated then removed in 2 releases.  However, this can only be used
with corresponding support in the host kernel - which was dropped
years ago (in c17b98cf "KVM: PPC: Book3S HV: Remove code for PPC970
processors" of 2014-12-03 to be precise).  Therefore it should be ok
to drop this immediately.

Just to be clear this only affects *KVM HV* guests with PowerPC 970,
and those already require an ancient host kernel.  TCG and KVM PR
guests with PowerPC 970 should still work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Thomas Huth <thuth@redhat.com>
2018-05-04 11:15:18 +10:00
Bharata B Rao a324d6f166 spapr: Support ibm,dynamic-memory-v2 property
The new property ibm,dynamic-memory-v2 allows memory to be represented
in a more compact manner in device tree.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:23 +10:00
Serhii Popovych da9f80fbad spapr: Add ibm,max-associativity-domains property
Now recent kernels (i.e. since linux-stable commit a346137e9142
("powerpc/numa: Use ibm,max-associativity-domains to discover possible nodes")
support this property to mark initially memory-less NUMA nodes as "possible"
to allow further memory hot-add to them.

Advertise this property for pSeries machines to let guest kernels detect
maximum supported node configuration and benefit from kernel side change
when hot-add memory to specific, possibly empty before, NUMA node.

Signed-off-by: Serhii Popovych <spopovyc@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2018-04-27 18:05:23 +10:00