memory_device_get_free_addr() dereferences @errp when
memory_device_check_addable() fails. That's wrong; see the big
comment in error.h. Introduced in commit 1b6d6af21b "pc-dimm: factor
out capacity and slot checks into MemoryDevice".
No caller actually passes null.
Fix anyway: splice in a local Error *err, and error_propagate().
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20191204093625.14836-11-armbru@redhat.com>
isa_ipmi_bt_realize(), ipmi_isa_realize(), pci_ipmi_bt_realize(), and
pci_ipmi_kcs_realize() dereference @errp when IPMIInterfaceClass
method init() fails. That's wrong; see the big comment in error.h.
Introduced in commit 0719029c47 "ipmi: Add an ISA KCS low-level
interface", then imitated in commit a9b74079cb "ipmi: Add a BT
low-level interface" and commit 12f983c6aa "ipmi: Add PCI IPMI
interfaces".
No caller actually passes null.
Fix anyway: splice in a local Error *err, and error_propagate().
Cc: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191204093625.14836-9-armbru@redhat.com>
fit_load_fdt() passes @errp to fit_image_addr(), then recovers from
ENOENT failures. Passing @errp is wrong, because it works only as
long as @errp is neither @error_fatal nor @error_abort. Error
recovery dereferences @errp. That's also wrong; see the big comment
in error.h. Error recovery can leave *errp pointing to a freed
Error object. Wrong, it must be null on success. Messed up in
commit 3eb99edb48 "loader-fit: Wean off error_printf()".
No caller actually passes such values, or uses *errp on success.
Fix anyway: splice in a local Error *err, and error_propagate().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20191204093625.14836-8-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
legacy_acpi_cpu_plug_cb() dereferences @errp when
acpi_set_cpu_present_bit() fails. That's wrong; see the big comment
in error.h. Introduced in commit cc43364de7 "acpi/cpu-hotplug:
introduce helper function to keep bit setting in one place".
No caller actually passes null, and acpi_set_cpu_present_bit() can't
actually fail.
Fix anyway: drop acpi_set_cpu_present_bit()'s @errp parameter.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <20191204093625.14836-7-armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Devices "ivshmem-plain" and "ivshmem-doorbell" support only MSI-X.
Config space register Interrupt Pin is zero. Device "ivshmem"
additionally supported legacy INTx, but it was removed in commit
5a0e75f0a9 "hw/misc/ivshmem: Remove deprecated "ivshmem" legacy
device". The commit left ivshmem_update_irq() behind. Since the
Interrupt Pin register is zero, the function does nothing. Remove it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20191205203557.11254-1-armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It's been deprecated since QEMU v3.1. We've explicitly asked in the
deprecation message that people should speak up on qemu-devel in case
they are still actively using the bluetooth part of QEMU, but nobody
ever replied that they are really still using it.
I've tried it on my own to use this bluetooth subsystem for one of my
guests, but I was also not able to get it running anymore: When I was
trying to pass-through a real bluetooth device, either the guest did
not see the device at all, or the guest crashed.
Even worse for the emulated device: When running
qemu-system-x86_64 -bt device:keyboard
QEMU crashes once you hit a key.
So it seems like the bluetooth stack is not only neglected, it is
completely bitrotten, as far as I can tell. The only attention that
this code got during the past years were some CVEs that have been
spotted there. So this code is a burden for the developers, without
any real benefit anymore. Time to remove it.
Note: hw/bt/Kconfig only gets cleared but not removed here yet.
Otherwise there is a problem with the *-softmmu/config-devices.mak.d
dependency files - they still contain a reference to this file which
gets evaluated first on some build hosts, before the file gets
properly recreated. To avoid breaking these builders, we still need
the file around for some time. It will get removed in a couple of
weeks instead.
Message-Id: <20191120091014.16883-4-thuth@redhat.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
It isn't used anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623844102.360005.12070225703151669294.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XSCOM bus is implemented with a QOM interface, which is mostly
generic from a CPU type standpoint, except for the computation of
addresses on the Pervasive Connect Bus (PCB) network. This is handled
by the pnv_xscom_pcba() function with a switch statement based on
the chip_type class level attribute of the CPU chip.
This can be achieved using QOM. Also the address argument is masked with
PNV_XSCOM_SIZE - 1, which is for POWER8 only. Addresses may have different
sizes with other CPU types. Have each CPU chip type handle the appropriate
computation with a QOM xscom_pcba() method.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623843543.360005.13996472463887521794.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the
"compatible" property. Just pass the compat string and its size as
arguments.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623842430.360005.9513965612524265862.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since pnv_dt_xscom() is called from chip specific dt_populate() hooks,
it shouldn't have to guess the chip type in order to populate the "reg"
property. Just pass the base address and address size as arguments.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623841868.360005.17577624823547136435.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pnv_chip_core_realize() function configures the XSCOM MMIO subregion
for each core of a single chip. The base address of the subregion depends
on the CPU type. Its computation is currently open-code using the
pnv_chip_is_powerXX() helpers. This can be achieved with QOM. Introduce
a method for this in the base chip class and implement it in child classes.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623841311.360005.4705705734873339545.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pnv_pic_print_info() callback checks the type of the chip in order
to forward to the request appropriate interrupt controller. This can
be achieved with QOM. Introduce a method for this in the base chip class
and implement it in child classes.
This also prepares ground for the upcoming interrupt controller of POWER10
chips.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623840755.360005.5002022339473369934.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We add an extra node to advertise power management on some machines,
namely powernv9 and powernv10. This is achieved by using the
pnv_is_power9() and pnv_is_power10() helpers.
This can be achieved with QOM. Add a method to the base class for
powernv machines and have it implemented by machine types that
support power management instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623839642.360005.9243510140436689941.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pnv_dt_create() function generates different contents for the
"compatible" property of the root node in the DT, depending on the
CPU type. This is open coded with multiple ifs using pnv_is_powerXX()
helpers.
It seems cleaner to achieve with QOM. Introduce a base class for the
powernv machine and a compat attribute that each child class can use
to provide the value for the "compatible" property.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623839085.360005.4046508784077843216.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Folded in small fix Greg spotted after posting]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It isn't used anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623838530.360005.15470128760871845396.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Processor Service Interface (PSI) model has a chip_type class level
attribute, which is used to generate the content of the "compatible" DT
property according to the CPU type.
Since the PSI model already has specialized classes for each supported
CPU type, it seems cleaner to achieve this with QOM. Provide the content
of the "compatible" property with a new class level attribute.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157623837974.360005.14706607446188964477.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The OCC common area is mapped at a unique address on the system and
each OCC is assigned a segment to expose its sensor data :
-------------------------------------------------------------------------
| Start (Offset from | End | Size |Description |
| BAR2 base address) | | | |
-------------------------------------------------------------------------
| 0x00580000 | 0x005A57FF |150kB |OCC 0 Sensor Data Block|
| 0x005A5800 | 0x005CAFFF |150kB |OCC 1 Sensor Data Block|
| : | : | : | : |
| 0x00686800 | 0x006ABFFF |150kB |OCC 7 Sensor Data Block|
| 0x006AC000 | 0x006FFFFF |336kB |Reserved |
-------------------------------------------------------------------------
Maximum size is 1.5MB.
We could define a "OCC common area" memory region at the machine level
and sub regions for each OCC. But it adds some extra complexity to the
models. Fix the current layout with a simpler model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191211082912.2625-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PBA bridge unit (Power Bus Access) connects the OCC (On Chip
Controller) to the Power bus and System Memory. The PBA is used to
gather sensor data, for power management, for sleep states, for
initial boot, among other things.
The PBA logic provides a set of four registers PowerBus Access Base
Address Registers (PBABAR0..3) which map the OCC address space to the
PowerBus space. These registers are setup by the initial FW and define
the PowerBus Range of system memory that can be accessed by PBA.
The current modeling of the PBABAR registers is done under the common
XSCOM handlers. We introduce a specific XSCOM regions for these
registers and fix :
- BAR sizes and BAR masks
- The mapping of the OCC common area. It is common to all chips and
should be mapped once. We will address per-OCC area in the next
change.
- OCC common area is in BAR 3 on P8
Inspired by previous work of Balamuruhan S <bala24@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191211082912.2625-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some devices could be initialized in the instance_init handler but not
realized for configuration reasons. Nodes should not be added in the DT
for such devices.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191210135845.19773-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some PnvXScomInterface objects lie a bit deeper (PnvPBCQState) than
the first layer, so we need to loop on the whole object hierarchy to
catch them.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191210135845.19773-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Corrected error in comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spr TBU40 is used to set the upper 40 bits of the timebase
register, present on POWER5+ and later processors.
This register can only be written by the hypervisor, and cannot be read.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Processor Utilisation of Resources Register (PURR) and Scaled
Processor Utilisation of Resources Register (SPURR) provide an estimate
of the resources used by the thread, present on POWER7 and later
processors.
Currently the [S]PURR registers simply count at the rate of the
timebase.
Preserve this behaviour but rework the implementation to store an offset
like the timebase rather than doing the calculation manually. Also allow
hypervisor write access to the register along with the currently
available read access.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The virtual timebase register (VTB) is a 64-bit register which
increments at the same rate as the timebase register, present on POWER8
and later processors.
The register is able to be read/written by the hypervisor and read by
the supervisor. All other accesses are illegal.
Currently the VTB is just an alias for the timebase (TB) register.
Implement the VTB so that is can be read/written independent of the TB.
Make use of the existing method for accessing timebase facilities where
by the compensation is stored and used to compute the value on reads/is
updated on writes.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[ clg: rebased on current ppc tree ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191128134700.16091-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Same a POWER9, only the MMIO window changes.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The POWER10 PSIHB controller is very similar to the one on POWER9. We
should probably introduce a common PnvPsiXive object.
The ESB page size should be changed to 64k when P10 support is ready.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is an empty shell with the XSCOM bus and cores. The chip controllers
will come later.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191205184454.10722-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The power7_set_irq() and power9_set_irq() functions set this but it is
never used actually. Modern Book3s compatible CPUs are only supported
by the pnv and spapr machines. They have an interrupt controller, XICS
for POWER7/8 and XIVE for POWER9, whose models don't require to track
IRQ input states at the CPU level.
Drop these lines to avoid confusion.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548862861.3650476.16622818876928044450.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The correct way to do this is to deassert the input pins on the CPU side.
This is the case since a previous change.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548862298.3650476.1228720391270249433.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU is reset, QEMU makes sure no interrupt is pending by clearing
CPUPPCstate::pending_interrupts in ppc_cpu_reset(). In the case of a
complete machine emulation, eg. a sPAPR machine, an external interrupt
request could still be pending in KVM though, eg. an IPI. It will be
eventually presented to the guest, which is supposed to acknowledge it at
the interrupt controller. If the interrupt controller is emulated in QEMU,
either XICS or XIVE, ppc_set_irq() won't deassert the external interrupt
pin in KVM since it isn't pending anymore for QEMU. When the vCPU re-enters
the guest, the interrupt request is still pending and the vCPU will try
again to acknowledge it. This causes an infinite loop and eventually hangs
the guest.
The code has been broken since the beginning. The issue wasn't hit before
because accel=kvm,kernel-irqchip=off is an awkward setup that never got
used until recently with the LC92x IBM systems (aka, Boston).
Add a ppc_irq_reset() function to do the necessary cleanup, ie. deassert
the IRQ pins of the CPU in QEMU and most importantly the external interrupt
pin for this vCPU in KVM.
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157548861740.3650476.16879693165328764758.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_ovec_diff(ov, old, new) has somewhat complex semantics. ov is set
to those bits which are in new but not old, and it returns as a boolean
whether or not there are any bits in old but not new.
It turns out that both callers only care about the second, not the first.
This is basically equivalent to a bitmap subset operation, which is easier
to understand and implement. So replace spapr_ovec_diff() with
spapr_ovec_subset().
Cc: Mike Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
spapr_h_cas_compose_response() handles the last piece of the PAPR feature
negotiation process invoked via the ibm,client-architecture-support OF
call. Its only caller is h_client_architecture_support() which handles
most of the rest of that process.
I believe it was placed in a separate file originally to handle some
fiddly dependencies between functions, but mostly it's just confusing
to have the CAS process split into two pieces like this. Now that
compose response is simplified (by just generating the whole device
tree anew), it's cleaner to just fold it into
h_client_architecture_support().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Previously, spapr_build_fdt() constructed the device tree in a fixed
buffer of size FDT_MAX_SIZE. This is a bit inflexible, but more
importantly it's awkward for the case where we use it during CAS. In
that case the guest firmware supplies a buffer and we have to
awkwardly check that what we generated fits into it afterwards, after
doing a lot of size checks during spapr_build_fdt().
Simplify this by having spapr_build_fdt() take a 'space' parameter.
For the CAS case, we pass in the buffer size provided by SLOF, for the
machine init case, we continue to pass FDT_MAX_SIZE.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
PAPR allows the interrupt controller used on a POWER9 machine (XICS or
XIVE) to be selected by the guest operating system, by using the
ibm,client-architecture-support (CAS) feature negotiation call.
Currently, if the guest selects an interrupt controller different from the
one selected at initial boot, this causes the system to be reset with the
new model and the boot starts again. This means we run through the SLOF
boot process twice, as well as any other bootloader (e.g. grub) in use
before the OS calls CAS. This can be confusing and/or inconvenient for
users.
Thanks to two fairly recent changes, we no longer need this reboot. 1) we
now completely regenerate the device tree when CAS is called (meaning we
don't need special case updates for all the device tree changes caused by
the interrupt controller mode change), 2) we now have explicit code paths
to activate and deactivate the different interrupt controllers, rather than
just implicitly calling those at machine reset time.
We can therefore eliminate the reboot for changing irq mode, simply by
putting a call to spapr_irq_update_active_intc() before we call
spapr_h_cas_compose_response() (which gives the updated device tree to
the guest firmware and OS).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cedric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Make kvmppc_hint_smt_possible hint append helper well formed:
rename errp to errp_in, as it is IN-parameter here (which is unusual
for errp), rename function to be kvmppc_error_append_*_hint.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20191127191434.20945-1-vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is useful to dump the saved contexts of the vCPUs : configuration
of the base END index of the vCPU and the Interrupt Pending Buffer
register, which is updated when an interrupt can not be presented.
When dumping the NVT table, we skip empty indirect pages which are not
necessarily allocated.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-21-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When doing CAM line compares, fetch the block id from the interrupt
controller which can have set the PC_TCTXT_CHIPID field.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-20-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When PC_TCTXT_CHIPID_OVERRIDE is configured, the PC_TCTXT_CHIPID field
overrides the hardwired chip ID in the Powerbus operations and for CAM
compares. This is typically used in the one block-per-chip configuration
to associate a unique block id number to each IC of the system.
Simplify the model with a pnv_xive_block_id() helper and remove
'tctx_chipid' which becomes useless.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-19-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a vCPU is dispatched on a HW thread, its context is pushed in the
thread registers and it is activated by setting the VO bit in the CAM
line word2. The HW grabs the associated NVT, pulls the IPB bits and
merges them with the IPB of the new context. If interrupts were missed
while the vCPU was not dispatched, these are synthesized in this
sequence.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-18-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We will use it to resend missed interrupts when a vCPU context is
pushed on a HW thread.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-17-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is now unused.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-16-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On the P9 Processor, the thread interrupt context registers of a CPU
can be accessed "directly" when by load/store from the CPU or
"indirectly" by the IC through an indirect TIMA page. This requires to
configure first the PC_TCTXT_INDIRx registers.
Today, we rely on the get_tctx() handler to deduce from the CPU PIR
the chip from which the TIMA access is being done. By handling the
TIMA memory ops under the interrupt controller model of each machine,
we can uniformize the TIMA direct and indirect ops under PowerNV. We
can also check that the CPUs have been enabled in the XIVE controller.
This prepares ground for the future versions of XIVE.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The TIMA region gives access to the thread interrupt context registers
of a CPU. It is mapped at the same address on all chips and can be
accessed by any CPU of the system. To identify the chip from which the
access is being done, the PowerBUS uses a 'chip' field in the
load/store messages. QEMU does not model these messages, instead, we
extract the chip id from the CPU PIR and do a lookup at the machine
level to fetch the targeted interrupt controller.
Introduce pnv_get_chip() and pnv_xive_tm_get_xive() helpers to clarify
this process in pnv_xive_get_tctx(). The latter will be removed in the
subsequent patches but the same principle will be kept.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-14-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XIVE KVM devices now has an attribute to configure the number of
interrupt servers. This allows to greatly optimize the usage of the VP
space in the XIVE HW, and thus to start a lot more VMs.
Only set this attribute if available in order to support older POWER9
KVM.
The XIVE KVM device now reports the exhaustion of VPs upon the
connection of the first VCPU. Check that in order to have a chance
to provide a hint to the user.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157478679392.67101.7843580591407950866.stgit@bahia.tlslab.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XICS-on-XIVE KVM devices now has an attribute to configure the number
of interrupt servers. This allows to greatly optimize the usage of the VP
space in the XIVE HW, and thus to start a lot more VMs.
Only set this attribute if available in order to support older POWER9 KVM
and pre-POWER9 XICS KVM devices.
The XICS-on-XIVE KVM device now reports the exhaustion of VPs upon the
connection of the first VCPU. Check that in order to have a chance to
provide a hint to the user.
`
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157478678846.67101.9660531022460517710.stgit@bahia.tlslab.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XIVE and XICS-on-XIVE KVM devices on POWER9 hosts can greatly reduce
their consumption of some scarce HW resources, namely Virtual Presenter
identifiers, if they know the maximum number of vCPUs that may run in the
VM.
Prepare ground for this by passing the value down to xics_kvm_connect()
and kvmppc_xive_connect(). This is purely mechanical, no functional
change.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157478678301.67101.2717368060417156338.stgit@bahia.tlslab.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The TIMA operations are performed on behalf of the XIVE IVPE sub-engine
(Presenter) on the thread interrupt context registers. The current
operations supported by the model are simple and do not require access
to the controller but more complex operations will need access to the
controller NVT table and to its configuration.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the machines have handlers implementing the XiveFabric and
XivePresenter interfaces, remove xive_presenter_match() and make use
of the 'match_nvt' handler of the machine.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The CAM line matching sequence in the pseries machine does not change
much apart from the use of the new QOM interfaces. There is an extra
indirection because of the sPAPR IRQ backend of the machine. Only the
XIVE backend implements the new 'match_nvt' handler.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-11-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>