Commit Graph

1921 Commits

Author SHA1 Message Date
Evgeny Yakovlev 5f2585772f virtio-blk: advertise F_WCE (F_FLUSH) if F_CONFIG_WCE is advertised
Virtio spec 1.1 (and earlier), 5.2.5.2 Driver Requirements: Device
Initialization:

"Devices SHOULD always offer VIRTIO_BLK_F_FLUSH, and MUST offer it if
they offer VIRTIO_BLK_F_CONFIG_WCE"

Currently F_CONFIG_WCE and F_WCE are not connected to each other.
Qemu will advertise F_CONFIG_WCE if config-wce argument is
set for virtio-blk device. And F_WCE is advertised only if
underlying block backend actually has it's caching enabled.

Fix this by advertising F_WCE if F_CONFIG_WCE is also advertised.

To preserve backwards compatibility with newer machine types make this
behaviour governed by "x-enable-wce-if-config-wce" virtio-blk-device
property and introduce hw_compat_4_2 with new property being off by
default for all machine types <= 4.2 (but don't introduce 4.3
machine type itself yet).

Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1572978137-189218-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2019-12-13 11:22:06 +00:00
PanNengyuan 59d0533b85 ppc/spapr_events: fix potential NULL pointer dereference in rtas_event_log_dequeue
This fixes coverity issues 68911917:
        360
    CID 68911917: (NULL_RETURNS)
        361. dereference: Dereferencing "source", which is known to be
             "NULL".
        361        if (source->mask & event_mask) {
        362            break;
        363        }

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: PanNengyuan <pannengyuan@huawei.com>
Message-Id: <1574685291-38176-1-git-send-email-pannengyuan@huawei.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-11-26 10:12:58 +11:00
David Gibson b14848f5d7 spapr: Work around spurious warnings from vfio INTx initialization
Traditional PCI INTx for vfio devices can only perform well if using
an in-kernel irqchip.  Therefore, vfio_intx_update() issues a warning
if an in kernel irqchip is not available.

We usually do have an in-kernel irqchip available for pseries machines
on POWER hosts.  However, because the platform allows feature
negotiation of what interrupt controller model to use, we don't
currently initialize it until machine reset.  vfio_intx_update() is
called (first) from vfio_realize() before that, so it can issue a
spurious warning, even if we will have an in kernel irqchip by the
time we need it.

To workaround this, make a call to spapr_irq_update_active_intc() from
spapr_irq_init() which is called at machine realize time, before the
vfio realize.  This call will be pretty much obsoleted by the later
call at reset time, but it serves to suppress the spurious warning
from VFIO.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
2019-11-26 10:11:30 +11:00
David Gibson e532e1d93c spapr: Handle irq backend changes with VFIO PCI devices
pseries machine type can have one of two different interrupt controllers in
use depending on feature negotiation with the guest.  Usually this is
invisible to devices, because they route to a common set of qemu_irqs which
in turn dispatch to the correct back end.

VFIO passthrough devices, however, wire themselves up directly to the KVM
irqchip for performance, which means they are affected by this change in
interrupt controller.  To get them to adjust correctly for the change in
irqchip, we need to fire the kvm irqchip change notifier.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
2019-11-26 10:11:30 +11:00
Alexey Kardashevskiy a49f62b9fd spapr: Add /chosen to FDT only at reset time to preserve kernel and initramdisk
Since "spapr: Render full FDT on ibm,client-architecture-support" we build
the entire flatten device tree (FDT) twice - at the reset time and
when "ibm,client-architecture-support" (CAS) is called. The full FDT from
CAS is then applied on top of the SLOF internal device tree.

This is mostly ok, however there is a case when the QEMU is started with
-initrd and for some reason the guest decided to move/unpack the init RAM
disk image - the guest correctly notifies SLOF about the change but
at CAS it is overridden with the QEMU initial location addresses and
the guest may fail to boot if the original initrd memory was changed.

This fixes the problem by only adding the /chosen node at the reset time
to prevent the original QEMU's linux,initrd-start/linux,initrd-end to
override the updated addresses.

This only treats /chosen differently as we know there is a special case
already and it is unlikely anything else will need to change /chosen at CAS
we are better off not touching /chosen after we handed it over to SLOF.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20191024041308.5673-1-aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2019-11-18 11:50:33 +01:00
Greg Kurz 0990ce6a2e ppc: Add intc_destroy() handlers to SpaprInterruptController/PnvChip
SpaprInterruptControllerClass and PnvChipClass have an intc_create() method
that calls the appropriate routine, ie. icp_create() or xive_tctx_create(),
to establish the link between the VCPU and the presenter component of the
interrupt controller during realize.

There aren't any symmetrical call to be called when the VCPU gets unrealized
though. It is assumed that object_unparent() is the only thing to do.

This is questionable because the parenting logic around the CPU and
presenter objects is really an implementation detail of the interrupt
controller. It shouldn't be open-coded in the machine code.

Fix this by adding an intc_destroy() method that undoes what was done in
intc_create(). Also NULLify the presenter pointers to avoid having
stale pointers around. This will allow to reliably check if a vCPU has
a valid presenter.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
2019-11-18 11:49:11 +01:00
Wei Yang 038adc2f58 core: replace getpagesize() with qemu_real_host_page_size
There are three page size in qemu:

  real host page size
  host page size
  target page size

All of them have dedicate variable to represent. For the last two, we
use the same form in the whole qemu project, while for the first one we
use two forms: qemu_real_host_page_size and getpagesize().

qemu_real_host_page_size is defined to be a replacement of
getpagesize(), so let it serve the role.

[Note] Not fully tested for some arch or device.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
Message-Id: <20191013021145.16011-3-richardw.yang@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-26 15:38:06 +02:00
Philippe Mathieu-Daudé 819ce6b2a5 hw: Move M48T59 device from hw/timer/ to hw/rtc/ subdirectory
The M48T59 is a Real Time Clock, not a timer.
Move it under the hw/rtc/ subdirectory.

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191003230404.19384-5-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-10-24 20:20:45 +02:00
Philippe Mathieu-Daudé bcdb90640a hw: Move MC146818 device from hw/timer/ to hw/rtc/ subdirectory
The MC146818 is a Real Time Clock, not a timer.
Move it under the hw/rtc/ subdirectory.

Use copyright statement from 80cabfad16 for "hw/rtc/mc146818rtc.h".

Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20191003230404.19384-4-philmd@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2019-10-24 20:13:10 +02:00
Peter Maydell 58560ad254 ppc patch queue 2019-10-24
Last pull request before soft freeze.
   * Lots of fixes and cleanups for spapr interrupt controllers
   * More SLOF updates to fix problems with full FDT rendering at CAS
     time (alas, more yet are to come)
   * A few other assorted changes
 
 This isn't quite as well tested as I usually try to do before a pull
 request.  But I've been sick and running into some other difficulties,
 and wanted to get this sent out before heading towards KVM forum.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl2xXWcACgkQbDjKyiDZ
 s5Jy/BAAsSo514vGCjdszXcRH3nFeODKJadlSsUX+32JFP1yJS9ooxkcmIN7o9Wp
 3VCkMHQPVV9jjIvvShWOSGfDDO3o8fTEucOIX/Nn9wfq+NiY+EJst0v+8OT48CSX
 LEXiy9Wghs9pZMLCUZ3rlLPBiU/Lhzf+KTCoUdc40tfoZMMz1lp/Uy8IdIYTYwLl
 /z++r4X8FOsXsDDsFopWffVdVBLJz6Var6NgBa8ISk2gGnUOAKsrTE3bD9L6n4PR
 YYbMSkv+SbvXg4gm53jUb9cQgpBqQpWHUYBIbKia/16EzbIkkZjFE2jGQMP5c72h
 ZOml7ZQtQVWIEEZwKPN67S8bKiVbEfayxHYViejn/uUqv3AwW0wi7FlBVv37YNJ4
 TxPvLBu+0DaFbk5y6/XHyL6XomG1/oH6qXOM2JhIWON7HI3rRWoMQbZ6QVJ1Gwk2
 uwrvOOL5kVZySotOw5bDkTXYp/Nm1JE4QwOXFPkXzaekcZhRlEqqrkBddhKtF80p
 1e5hGp5RgoILIe8uHJQ7decUMk889J7Qdtakv6BWvOci4dbIiZEp/smFlzgTcPnW
 DQJONP/awnoAOS3v0bItf59DROvkZ5xyv8yQZFP3qThSOfZl4e95WNRbtR3vtjU4
 Bl4Pdte15URKy5nM0XnnLg9mzl2xufdwEsu76lQMuYpe6nCI2h0=
 =FRHF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20191024' into staging

ppc patch queue 2019-10-24

Last pull request before soft freeze.
  * Lots of fixes and cleanups for spapr interrupt controllers
  * More SLOF updates to fix problems with full FDT rendering at CAS
    time (alas, more yet are to come)
  * A few other assorted changes

This isn't quite as well tested as I usually try to do before a pull
request.  But I've been sick and running into some other difficulties,
and wanted to get this sent out before heading towards KVM forum.

# gpg: Signature made Thu 24 Oct 2019 09:14:31 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.2-20191024: (28 commits)
  spapr/xive: Set the OS CAM line at reset
  ppc/pnv: Fix naming of routines realizing the CPUs
  ppc: Reset the interrupt presenter from the CPU reset handler
  ppc/pnv: Add a PnvChip pointer to PnvCore
  ppc/pnv: Introduce a PnvCore reset handler
  spapr_cpu_core: Implement DeviceClass::reset
  spapr: move CPU reset after presenter creation
  spapr: Don't request to unplug the same core twice
  pseries: Update SLOF firmware image
  spapr: Move SpaprIrq::nr_xirqs to SpaprMachineClass
  spapr: Remove SpaprIrq::nr_msis
  spapr, xics, xive: Move SpaprIrq::post_load hook to backends
  spapr, xics, xive: Move SpaprIrq::reset hook logic into activate/deactivate
  spapr: Remove SpaprIrq::init_kvm hook
  spapr, xics, xive: Match signatures for XICS and XIVE KVM connect routines
  spapr, xics, xive: Move dt_populate from SpaprIrq to SpaprInterruptController
  spapr, xics, xive: Move print_info from SpaprIrq to SpaprInterruptController
  spapr, xics, xive: Move set_irq from SpaprIrq to SpaprInterruptController
  spapr: Formalize notion of active interrupt controller
  spapr, xics, xive: Move irq claim and free from SpaprIrq to SpaprInterruptController
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-24 16:22:58 +01:00
Igor Mammedov 2def24f159 ppc: rs6000_mc: drop usage of memory_region_allocate_system_memory()
rs6000mc_realize() violates memory_region_allocate_system_memory() contract
by calling it multiple times which could break -mem-path. Replace it with
plain memory_region_init_ram() instead.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20191008113318.7012-3-imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-23 23:37:42 -03:00
Cédric Le Goater 00d6f4db60 ppc/pnv: Fix naming of routines realizing the CPUs
The 'vcpu' suffix is inherited from the sPAPR machine. Use better
names for PowerNV.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191022163812.330-7-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:34:09 +11:00
Cédric Le Goater d49e8a9b46 ppc: Reset the interrupt presenter from the CPU reset handler
On the sPAPR machine and PowerNV machine, the interrupt presenters are
created by a machine handler at the core level and are reset
independently. This is not consistent and it raises issues when it
comes to handle hot-plugged CPUs. In that case, the presenters are not
reset. This is less of an issue in XICS, although a zero MFFR could
be a concern, but in XIVE, the OS CAM line is not set and this breaks
the presenting algorithm. The current code has workarounds which need
a global cleanup.

Extend the sPAPR IRQ backend and the PowerNV Chip class with a new
cpu_intc_reset() handler called by the CPU reset handler and remove
the XiveTCTX reset handler which is now redundant.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-6-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:33:45 +11:00
Cédric Le Goater aa5ac64b23 ppc/pnv: Add a PnvChip pointer to PnvCore
We will use it to reset the interrupt presenter from the CPU reset
handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191022163812.330-5-clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:33:33 +11:00
Cédric Le Goater fa06541b5d ppc/pnv: Introduce a PnvCore reset handler
in which individual CPUs are reset. It will ease the introduction of
future change reseting the interrupt presenter from the CPU reset
handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20191022163812.330-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:32:58 +11:00
Greg Kurz d1f2b4691a spapr_cpu_core: Implement DeviceClass::reset
Since vCPUs aren't plugged into a bus, we manually register a reset
handler for each vCPU. We also call this handler at realize time
to ensure hot plugged vCPUs are reset before being exposed to the
guest. This results in vCPUs being reset twice at machine reset.
It doesn't break anything but it is slightly suboptimal and above
all confusing.

The hotplug path in device_set_realized() already knows how to reset
a hotplugged device if the device reset handler is present. Implement
one for sPAPR CPU cores that resets all vCPUs under a core.

While here rename spapr_cpu_reset() to spapr_reset_vcpu() for
consistency with spapr_realize_vcpu() and spapr_unrealize_vcpu().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[clg: add documentation on the reset helper usage ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:32:33 +11:00
Cédric Le Goater 90f8db52bb spapr: move CPU reset after presenter creation
This change prepares ground for future changes which will reset the
interrupt presenter in the reset handler of the sPAPR and PowerNV
cores.

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 13:32:33 +11:00
Greg Kurz 47c8c915b1 spapr: Don't request to unplug the same core twice
We must not call spapr_drc_detach() on a detached DRC otherwise bad things
can happen, ie. QEMU hangs or crashes. This is easily demonstrated with
a CPU hotplug/unplug loop using QMP.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157185826035.3073024.1664101000438499392.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 09:37:54 +11:00
David Gibson 54255c1f65 spapr: Move SpaprIrq::nr_xirqs to SpaprMachineClass
For the benefit of peripheral device allocation, the number of available
irqs really wants to be the same on a given machine type version,
regardless of what irq backends we are using.  That's the case now, but
only because we make sure the different SpaprIrq instances have the same
value except for the special legacy one.

Since this really only depends on machine type version, move the value to
SpaprMachineClass instead of SpaprIrq.  This also puts the code to set it
to the lower value on old machine types right next to setting
legacy_irq_allocation, which needs to go hand in hand.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 8cbe71ecb8 spapr: Remove SpaprIrq::nr_msis
The nr_msis value we use here has to line up with whether we're using
legacy or modern irq allocation.  Therefore it's safer to derive it based
on legacy_irq_allocation rather than having SpaprIrq contain a canned
value.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 605994e5b7 spapr, xics, xive: Move SpaprIrq::post_load hook to backends
The remaining logic in the post_load hook really belongs to the interrupt
controller backends, and just needs to be called on the active controller
(after the active controller is set to the right thing based on the
incoming migration in the generic spapr_irq_post_load() logic).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 567192d486 spapr, xics, xive: Move SpaprIrq::reset hook logic into activate/deactivate
It turns out that all the logic in the SpaprIrq::reset hooks (and some in
the SpaprIrq::post_load hooks) isn't really related to resetting the irq
backend (that's handled by the backends' own reset routines).  Rather its
about getting the backend ready to be the active interrupt controller or
stopping being the active interrupt controller - reset (and post_load) is
just the only time that changes at present.

To make this flow clearer, move the logic into the explicit backend
activate and deactivate hooks.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 0a17e0c39f spapr: Remove SpaprIrq::init_kvm hook
This hook is a bit odd.  The only caller is spapr_irq_init_kvm(), but
it explicitly takes an SpaprIrq *, so it's never really called through the
current SpaprIrq.  Essentially this is just a way of passing through a
function pointer so that spapr_irq_init_kvm() can handle some
configuration and error handling logic without duplicating it between the
xics and xive reset paths.

So, make it just take that function pointer.  Because of earlier reworks
to the KVM connect/disconnect code in the xics and xive backends we can
also eliminate some wrapper functions and streamline error handling a bit.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 98a39a7927 spapr, xics, xive: Match signatures for XICS and XIVE KVM connect routines
Both XICS and XIVE have routines to connect and disconnect KVM with
similar but not identical signatures.  This adjusts them to match
exactly, which will be useful for further cleanups later.

While we're there, we add an explicit return value to the connect path
to streamline error reporting in the callers.  We remove error
reporting the disconnect path.  In the XICS case this wasn't used at
all.  In the XIVE case the only error case was if the KVM device was
set up, but KVM didn't have the capability to do so which is pretty
obviously impossible.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 05289273c0 spapr, xics, xive: Move dt_populate from SpaprIrq to SpaprInterruptController
This method depends only on the active irq controller.  Now that we've
formalized the notion of active controller we can dispatch directly
through that, rather than dispatching via SpaprIrq with the dual
version having to do a second conditional dispatch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 328d8eb24d spapr, xics, xive: Move print_info from SpaprIrq to SpaprInterruptController
This method depends only on the active irq controller.  Now that we've
formalized the notion of active controller we can dispatch directly
through that, rather than dispatching via SpaprIrq with the dual
version having to do a second conditional dispatch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 7bcdbcca2f spapr, xics, xive: Move set_irq from SpaprIrq to SpaprInterruptController
This method depends only on the active irq controller.  Now that we've
formalized the notion of active controller we can dispatch directly through
that, rather than dispatching via SpaprIrq with the dual version having
to do a second conditional dispatch.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 81106ddd1a spapr: Formalize notion of active interrupt controller
spapr now has the mechanism of constructing both XICS and XIVE instances of
the SpaprInterruptController interface.  However, only one of the interrupt
controllers will actually be active at any given time, depending on feature
negotiation with the guest.  This is handled in the current code via
spapr_irq_current() which checks the OV5 vector from feature negotiation to
determine the current backend.

Determining the active controller at the point we need it like this
can be pretty confusing, because it makes it very non obvious at what
points the active controller can change.  This can make it difficult
to reason about the code and where a change of active controller could
appear in sequence with other events.

Make this mechanism more explicit by adding an 'active_intc' pointer
and an explicit spapr_irq_update_active_intc() function to update it
from the CAS state.  We also add hooks on the intc backend which will
get called when it is activated or deactivated.

For now we just introduce the switch and hooks, later patches will
actually start using them.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 0b0e52b131 spapr, xics, xive: Move irq claim and free from SpaprIrq to SpaprInterruptController
These methods, like cpu_intc_create, really belong to the interrupt
controller, but need to be called on all possible intcs.

Like cpu_intc_create, therefore, make them methods on the intc and
always call it for all existing intcs.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson ebd6be089b spapr, xics, xive: Move cpu_intc_create from SpaprIrq to SpaprInterruptController
This method essentially represents code which belongs to the interrupt
controller, but needs to be called on all possible intcs, rather than
just the currently active one.  The "dual" version therefore calls
into the xics and xive versions confusingly.

Handle this more directly, by making it instead a method on the intc
backend, and always calling it on every backend that exists.

While we're there, streamline the error reporting a bit.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
David Gibson 150e25f85b spapr, xics, xive: Introduce SpaprInterruptController QOM interface
The SpaprIrq structure is used to represent ths spapr machine's irq
backend.  Except that it kind of conflates two concepts: one is the
backend proper - a specific interrupt controller that we might or
might not be using, the other is the irq configuration which covers
the layout of irq space and which interrupt controllers are allowed.

This leads to some pretty confusing code paths for the "dual"
configuration where its hooks redirect to other SpaprIrq structures
depending on the currently active irq controller.

To clean this up, we start by introducing a new
SpaprInterruptController QOM interface to represent strictly an
interrupt controller backend, not counting anything configuration
related.  We implement this interface in the XICs and XIVE interrupt
controllers, and in future we'll move relevant methods from SpaprIrq
into it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-24 09:36:55 +11:00
Greg Kurz 29cb418749 spapr: Set VSMT to smp_threads by default
Support for setting VSMT is available in KVM since linux-4.13. Most distros
that support KVM on POWER already have it. It thus seem reasonable enough
to have the default machine to set VSMT to smp_threads.

This brings contiguous VCPU ids and thus brings their upper bound down to
the machine's max_cpus. This is especially useful for XIVE KVM devices,
which may thus allocate only one VP descriptor per VCPU.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157010411885.246126.12610015369068227139.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 09:36:55 +11:00
Cédric Le Goater 06d26eeb47 ppc/pnv: Use address_space_stq_be() when triggering an interrupt from PSI
Include the XIVE_TRIGGER_PQ bit in the trigger data which is how
hardware signals to the IC that the PQ bits of the interrupt source
have been checked.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191007084102.29776-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-24 09:36:55 +11:00
Tao Xu 0533ef5f20 numa: Introduce MachineClass::auto_enable_numa for implicit NUMA node
Add MachineClass::auto_enable_numa field. When it is true, a NUMA node
is expected to be created implicitly.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Tao Xu <tao3.xu@intel.com>
Message-Id: <20190905083238.1799-1-tao3.xu@intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2019-10-15 18:18:08 -03:00
Peter Maydell 0f0b43868a ppc patch queue 2019-10-04
Here's the next batch of ppc and spapr patches.  Includes:
   * Fist part of a large cleanup to irq infrastructure
   * Recreate the full FDT at CAS time, instead of making a difficult
     to follow set of updates.  This will help us move towards
     eliminating CAS reboots altogether
   * No longer provide RTAS blob to SLOF - SLOF can include it just as
     well itself, since guests will generally need to relocate it with
     a call to instantiate-rtas
   * A number of DFP fixes and cleanups from Mark Cave-Ayland
   * Assorted bugfixes
   * Several new small devices for powernv
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl2XEn0ACgkQbDjKyiDZ
 s5I6bA/7B5sjY/QxuE8axm5KupoAnE8zf205hN8mbYASwtDfFwgaeNreVaOSJUpr
 fgcx/g9G3rAryGZv3O6i02+wcRgNw1DnJ3ynCthIrExZEcfbTYJiS4s9apwPEQy8
 HFmBNdPDqrhFI0aFvXEUauiOp1aapPUUklm34eFscs94lJXxphRUEfa3XT5uEhUh
 xrIZwYq20A+ih4UHwk3Onyx/cvFpl6BRB2nVEllQFqzwF5eTTfz9t8+JGTebxD/7
 8qqt8ti0KM3wxSDTQnmyMUmpgy+C1iCvNYvv6nWFg+07QuGs48EHlQUUVVni4r9j
 kUrDwKS2eC+8e8gP/xdIXEq3R2DsAMq+wFIswXZ3X6x4DoUV0OAJSHc9iMD4l+pr
 LyWnVpDprc6XhJHWKpuHZ5w9EuBnZFbIXdlZGFno+8UvXtusnbbuwAZzHTrRJRqe
 /AWVpFwGAoOF4KxIOFlPVBI8m4vFad/soVojC0vzIbRqaogOFZAjiL/yD5GwLmMa
 tywOEMBUJ/j2lgudTCyKn5uCa/Ew3DS1TSdenJjyqRi/gZM0IaORIhJhyFYW/eO1
 U7Uh8BnbC+4J11wwvFR5+W789dgM2+EEtAX9uI08VcE/R2ASabZlN4Zwrl0w4cb/
 VRybMT4bgmjzHRpfrqYPxpn8wqPcIw0BCeipSOjY3QU1Q25TEYQ=
 =PXXe
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.2-20191004' into staging

ppc patch queue 2019-10-04

Here's the next batch of ppc and spapr patches.  Includes:
  * Fist part of a large cleanup to irq infrastructure
  * Recreate the full FDT at CAS time, instead of making a difficult
    to follow set of updates.  This will help us move towards
    eliminating CAS reboots altogether
  * No longer provide RTAS blob to SLOF - SLOF can include it just as
    well itself, since guests will generally need to relocate it with
    a call to instantiate-rtas
  * A number of DFP fixes and cleanups from Mark Cave-Ayland
  * Assorted bugfixes
  * Several new small devices for powernv

# gpg: Signature made Fri 04 Oct 2019 10:35:57 BST
# gpg:                using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-4.2-20191004: (53 commits)
  ppc/pnv: Remove the XICSFabric Interface from the POWER9 machine
  spapr: Eliminate SpaprIrq::init hook
  spapr: Add return value to spapr_irq_check()
  spapr: Use less cryptic representation of which irq backends are supported
  xive: Improve irq claim/free path
  spapr, xics, xive: Better use of assert()s on irq claim/free paths
  spapr: Handle freeing of multiple irqs in frontend only
  spapr: Remove unhelpful tracepoints from spapr_irq_free_xics()
  spapr: Eliminate SpaprIrq:get_nodename method
  spapr: Simplify spapr_qirq() handling
  spapr: Fix indexing of XICS irqs
  spapr: Eliminate nr_irqs parameter to SpaprIrq::init
  spapr: Clarify and fix handling of nr_irqs
  spapr: Replace spapr_vio_qirq() helper with spapr_vio_irq_pulse() helper
  spapr: Fold spapr_phb_lsi_qirq() into its single caller
  xics: Create sPAPR specific ICS subtype
  xics: Merge TYPE_ICS_BASE and TYPE_ICS_SIMPLE classes
  xics: Eliminate reset hook
  xics: Rename misleading ics_simple_*() functions
  xics: Eliminate 'reject', 'resend' and 'eoi' class hooks
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2019-10-07 13:49:02 +01:00
Eric Auger 549d400587 memory: allow memory_region_register_iommu_notifier() to fail
Currently, when a notifier is attempted to be registered and its
flags are not supported (especially the MAP one) by the IOMMU MR,
we generally abruptly exit in the IOMMU code. The failure could be
handled more nicely in the caller and especially in the VFIO code.

So let's allow memory_region_register_iommu_notifier() to fail as
well as notify_flag_changed() callback.

All sites implementing the callback are updated. This patch does
not yet remove the exit(1) in the amd_iommu code.

in SMMUv3 we turn the warning message into an error message saying
that the assigned device would not work properly.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-10-04 18:49:18 +02:00
Cédric Le Goater 1aba8716c8 ppc/pnv: Remove the XICSFabric Interface from the POWER9 machine
The POWER8 PowerNV machine needs to implement a XICSFabric interface
as this is the POWER8 interrupt controller model. But the POWER9
machine uselessly inherits of XICSFabric from the common PowerNV
machine definition.

Open code machine definitions to have a better control on the
different interfaces each machine should define.

Fixes: f30c843ced ("ppc/pnv: Introduce PowerNV machines with fixed CPU models")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191003143617.21682-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2019-10-04 19:08:23 +10:00
David Gibson f478d9af21 spapr: Eliminate SpaprIrq::init hook
This method is used to set up the interrupt backends for the current
configuration.  However, this means some confusing redirection between
the "dual" mode init and the init hooks for xics only and xive only modes.

Since we now have simple flags indicating whether XICS and/or XIVE are
supported, it's easier to just open code each initialization directly in
spapr_irq_init().  This will also make some future cleanups simpler.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:23 +10:00
David Gibson 0a3fd3df6f spapr: Add return value to spapr_irq_check()
Explicitly return success or failure, rather than just relying on the
Error ** parameter.  This makes handling it less verbose in the caller.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:23 +10:00
David Gibson ca62823b79 spapr: Use less cryptic representation of which irq backends are supported
SpaprIrq::ov5 stores the value for a particular byte in PAPR option vector
5 which indicates whether XICS, XIVE or both interrupt controllers are
available.  As usual for PAPR, the encoding is kind of overly complicated
and confusing (though to be fair there are some backwards compat things it
has to handle).

But to make our internal code clearer, have SpaprIrq encode more directly
which backends are available as two booleans, and derive the OV5 value from
that at the point we need it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:23 +10:00
David Gibson e594c2ad1c xive: Improve irq claim/free path
spapr_xive_irq_claim() returns a bool to indicate if it succeeded.
But most of the callers and one callee use int return values and/or an
Error * with more information instead.  In any case, ints are a more
common idiom for success/failure states than bools (one never knows
what sense they'll be in).

So instead change to an int return value to indicate presence of error
+ an Error * to describe the details through that call chain.

It also didn't actually check if the irq was already claimed, which is
one of the primary purposes of the claim path, so do that.

spapr_xive_irq_free() also returned a bool... which no callers checked
and was always true, so just drop it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:23 +10:00
David Gibson 580dde5e4a spapr, xics, xive: Better use of assert()s on irq claim/free paths
The irq claim and free paths for both XICS and XIVE check for some
validity conditions.  Some of these represent genuine runtime failures,
however others - particularly checking that the basic irq number is in a
sane range - could only fail in the case of bugs in the callin code.
Therefore use assert()s instead of runtime failures for those.

In addition the non backend-specific part of the claim/free paths should
only be used for PAPR external irqs, that is in the range SPAPR_XIRQ_BASE
to the maximum irq number.  Put assert()s for that into the top level
dispatchers as well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:23 +10:00
David Gibson f233cee97b spapr: Handle freeing of multiple irqs in frontend only
spapr_irq_free() can be used to free multiple irqs at once. That's useful
for its callers, but there's no need to make the individual backend hooks
handle this.  We can loop across the irqs in spapr_irq_free() itself and
have the hooks just do one at time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-04 19:08:23 +10:00
David Gibson 85d0425652 spapr: Remove unhelpful tracepoints from spapr_irq_free_xics()
These traces contain some useless information (the always-0 source#) and
have no equivalents for XIVE mode.  For now just remove them, and we can
put back something more sensible if and when we need it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-10-04 19:08:22 +10:00
David Gibson 14789694cd spapr: Eliminate SpaprIrq:get_nodename method
This method is used to determine the name of the irq backend's node in the
device tree, so that we can find its phandle (after SLOF may have modified
it from the phandle we initially gave it).

But, in the two cases the only difference between the node name is the
presence of a unit address.  Searching for a node name without considering
unit address is standard practice for the device tree, and
fdt_subnode_offset() will do exactly that, making this method unecessary.

While we're there, remove the XICS_NODENAME define.  The name
"interrupt-controller" is required by PAPR (and IEEE1275), and a bunch of
places assume it already.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:22 +10:00
David Gibson af1861511d spapr: Simplify spapr_qirq() handling
Currently spapr_qirq(), whic is used to find the qemu_irq for an spapr
global irq number, redirects through the SpaprIrq::qirq method.  But
the array of qemu_irqs is allocated in the PAPR layer, not the
backends, and so the method implementations all return the same thing,
just differing in the preliminary checks they make.

So, we can remove the method, and just implement spapr_qirq() directly,
including all the relevant checks in one place.  We change all those
checks into assert()s as well, since a failure here indicates an error in
the calling code.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-10-04 19:08:22 +10:00
David Gibson 9f53c0db19 spapr: Fix indexing of XICS irqs
spapr global irq numbers are different from the source numbers on the ICS
when using XICS - they're offset by XICS_IRQ_BASE (0x1000).  But
spapr_irq_set_irq_xics() was passing through the global irq number to
the ICS code unmodified.

We only got away with this because of a counteracting bug - we were
incorrectly adjusting the qemu_irq we returned for a requested global irq
number.

That approach mostly worked but is very confusing, incorrectly relies on
the way the qemu_irq array is allocated, and undermines the intention of
having the global array of qemu_irqs for spapr have a consistent meaning
regardless of irq backend.

So, fix both set_irq and qemu_irq indexing.  We rename some parameters at
the same time to make it clear that they are referring to spapr global
irq numbers.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:22 +10:00
David Gibson fe9b61b246 spapr: Eliminate nr_irqs parameter to SpaprIrq::init
The only reason this parameter was needed was to work around the
inconsistent meaning of nr_irqs between xics and xive.  Now that we've
fixed that, we can consistently use the number directly in the SpaprIrq
configuration.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
2019-10-04 19:08:22 +10:00
David Gibson ad8de98636 spapr: Clarify and fix handling of nr_irqs
Both the XICS and XIVE interrupt backends have a "nr-irqs" property, but
it means slightly different things.  For XICS (or, strictly, the ICS) it
indicates the number of "real" external IRQs.  Those start at XICS_IRQ_BASE
(0x1000) and don't include the special IPI vector.  For XIVE, however, it
includes the whole IRQ space, including XIVE's many IPI vectors.

The spapr code currently doesn't handle this sensibly, with the
nr_irqs value in SpaprIrq having different meanings depending on the
backend.  We fix this by renaming nr_irqs to nr_xirqs and making it
always indicate just the number of external irqs, adjusting the value
we pass to XIVE accordingly.  We also move to using common constants
in most of the irq configurations, to make it clearer that the IRQ
space looks the same to the guest (and emulated devices), even if the
backend is different.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
2019-10-04 19:08:22 +10:00
David Gibson 7678b74a94 spapr: Replace spapr_vio_qirq() helper with spapr_vio_irq_pulse() helper
Every caller of spapr_vio_qirq() immediately calls qemu_irq_pulse() with
the result, so we might as well just fold that into the helper.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
2019-10-04 19:08:22 +10:00