Commit Graph

29 Commits

Author SHA1 Message Date
David Gibson
74d042e5ce pseries: Implement qemu initiated shutdowns using EPOW events
At present, using 'system_powerdown' from the monitor or otherwise
instructing qemu to (cleanly) shut down a pseries guest will not work,
because we did not have a method of signalling the shutdown request to the
guest.

PAPR does include a usable mechanism for this, though it is rather more
involved than the equivalent on x86.  This involves sending an EPOW
(Environmental and POwer Warning) event through the PAPR event and error
logging mechanism, which also has a number of other functions.

This patch implements just enough of the event/error logging functionality
to be able to send a shutdown event to the guest.  At least with modern
guest kernels and a userspace that is up and running, this means that
system_powerdown from the qemu monitor should now work correctly on pseries
guests.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-29 11:45:54 +01:00
Avi Kivity
a8170e5e97 Rename target_phys_addr_t to hwaddr
target_phys_addr_t is unwieldly, violates the C standard (_t suffixes are
reserved) and its purpose doesn't match the name (most target_phys_addr_t
addresses are not target specific).  Replace it with a finger-friendly,
standards conformant hwaddr.

Outstanding patchsets can be fixed up with the command

  git rebase -i --exec 'find -name "*.[ch]"
                        | xargs s/target_phys_addr_t/hwaddr/g' origin

Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-10-23 08:58:25 -05:00
David Gibson
53724ee565 pseries: Rework implementation of TCE bypass
On the pseries machine the IOMMU (aka TCE tables) is always active for all
PCI and VIO devices.  Mostly to simplify the SLOF firmware, we implement an
extension which allows the IOMMU to be temporarily disabled for certain
devices.

Currently this is implemented by setting the device's DMAContext pointer to
NULL (thus reverting to qemu's default no-IOMMU DMA behaviour), then
replacing it when bypass mode is disabled.

This approach causes a bunch of complications though.  It complexifies the
management of the DMAContext lifetimes, it's problematic for savevm/loadvm,
and it means that while bypass is active we have nowhere to store the
device's LIOBN (Logical IO Bus Number, used to identify DMA address
spaces).  At present we regenerate the LIOBN from other address information
but this restricts how we can allocate LIOBNs.

This patch gives up on this approach, replacing it with the much simpler
one of having a 'bypass' boolean flag in the TCE state structure.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-04 15:54:18 +02:00
David Gibson
ff9d2afa61 pseries: Remove XICS irq type enum type
Currently the XICS interrupt controller emulation uses a custom enum to
specify whether a given interrupt is level-sensitive or message-triggered.
This enum makes life awkward for saving the state, and isn't particularly
useful since there are only two possibilities.  This patch replaces the
enum with a simple bool.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-04 15:54:18 +02:00
David Gibson
eddeed26ac pseries: Reset emulated PCI TCE tables on system reset
The emulated PCI host bridge on the pseries machine incorporates an IOMMU
(PAPR TCE table).  Currently the mappings in this IOMMU are not cleared
when we reset the system.  This patch fixes this bug.  To do this it adds
a new reset function to the IOMMU emulation code.  The VIO devices already
reset their TCE tables, but they do so by destroying and re-creating their
DMA context.  This doesn't work for the PCI host bridge, because the
infrastructure for PCI IOMMUs has already copied/cached the DMA pointer
context into the subordinate PCI device structures.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-04 15:54:17 +02:00
David Gibson
7f763a5d99 pseries: Add support for new KVM hash table control call
This adds support for then new "reset htab" ioctl which allows qemu
to properly cleanup the MMU hash table when the guest is reset. With
the corresponding kernel support, reset of a guest now works properly.

This also paves the way for indicating a different size hash table
to the kernel and for the kernel to be able to impose limits on
the requested size.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-10-04 15:54:17 +02:00
Alexey Kardashevskiy
5c4cbcf26c pseries dma: DMA window params added to PHB and DT population changed
Previously the only PCI bus supported was the emulated PCI bus with
fixed DMA window with start at 0 and size 1GB. As we are going to support
PCI pass through which DMA window properties are set by the host
kernel, we have to support DMA windows with parameters other than default.

This patch adds:

1. DMA window properties to sPAPRPHBState: LIOBN (bus id), start,
size of the window.

2. An additional function spapr_dma_dt() to populate DMA window
properties in the device tree which simply accepts all the parameters
and does not try to guess what kind of IOMMU is given to it.
The original spapr_dma_dt() is renamed to spapr_tcet_dma_dt().

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-08-15 19:43:16 +02:00
Alexey Kardashevskiy
f4b9523ba6 pseries: added allocator for a block of IRQs
The patch adds a simple helper which allocates a consecutive sequence
of IRQs calling spapr_allocate_irq for each and checks that allocated
IRQs go consequently.

The patch is required for upcoming support of MSI/MSIX on POWER.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-08-15 19:43:16 +02:00
Alexey Kardashevskiy
a307d59434 pseries: Rework irq assignment to avoid carrying qemu_irqs around
Currently, the interfaces in the pseries machine code for assignment
and setup of interrupts pass around qemu_irq objects.  That was done
in an attempt not to be too closely linked to the specific XICS
interrupt controller.  However interactions with the device tree setup
made that attempt rather futile, and XICS is part of the PAPR spec
anyway, so this really just meant we had to carry both the qemu_irq
pointers and the XICS irq numbers around.

This mess will just get worse when we add upcoming PCI MSI support,
since that will require tracking a bunch more interrupt.  Therefore,
this patch reworks the spapr code to just use XICS irq numbers
(roughly equivalent to GSIs on x86) and only retrieve the qemu_irq
pointers from the XICS code when we need them (a trivial lookup).

This is a reworked and generalized version of an earlier spapr_pci
specific patch from Alexey Kardashevskiy.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: fix checkpath warning]
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-08-15 19:43:16 +02:00
Alexander Graf
3fc5acdeed PPC: spapr: Remove global variable
Global variables are bad. Let's move spapr_has_graphics into the
machine state struct.

Signed-off-by: Alexander Graf <agraf@suse.de>
2012-08-15 19:43:15 +02:00
David Gibson
edded45406 pseries: Implement IOMMU and DMA for PAPR PCI devices
Currently the pseries machine emulation does not support DMA for emulated
PCI devices, because the PAPR spec always requires a (guest visible,
paravirtualized) IOMMU which was not implemented.  Now that we have
infrastructure for IOMMU emulation, we can correct this and allow PCI DMA
for pseries.

With the existing PAPR IOMMU code used for VIO devices, this is almost
trivial. We use a single DMAContext for each (virtual) PCI host bridge,
which is the usual configuration on real PAPR machines (which often have
_many_ PCI host bridges).

Cc: Alex Graf <agraf@suse.de>

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-06-27 16:33:26 -05:00
David Gibson
ad0ebb91cd pseries: Convert sPAPR TCEs to use generic IOMMU infrastructure
The pseries platform already contains an IOMMU implementation, since it is
essential for the platform's paravirtualized VIO devices.  This IOMMU
support is currently built into the implementation of the VIO "bus" and
the various VIO devices.

This patch converts this code to make use of the new common IOMMU
infrastructure.

We don't yet handle synchronization of map/unmap callbacks vs. invalidations,
this will require some complex interaction with the kernel and is not a
major concern at this stage.

Cc: Alex Graf <agraf@suse.de>

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-06-27 16:33:25 -05:00
Benjamin Herrenschmidt
c73e3771ea spapr: Add "memop" hypercall
This adds a qemu-specific hypervisor call to the pseries machine
which allows to do what amounts to memmove, memcpy and xor over
regions of physical memory such as the framebuffer.

This is the simplest way to get usable framebuffer speed from
SLOF since the framebuffer isn't mapped in the VRMA and so would
otherwise require an hcall per 8 bytes access.

The performance is still not great but usable, and can be improved
with a more complex implementation of the hcall itself if needed.

This also adds some documentation for the qemu-specific hypercalls
that we add to PAPR along with a new qemu,hypertas-functions property
that mirrors ibm,hypertas-functions and provides some discoverability
for the new calls.

Note: I chose note to advertise H_RTAS to the guest via that mechanism.
This is done on purpose, the guest uses the normal RTAS interfaces
provided by qemu (including SLOF) which internally calls H_RTAS.

We might in the future implement part (or even all) of RTAS inside the
guest like IBM's firmware does and replace H_RTAS with some finer grained
set of private hypercalls.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-06-24 01:04:45 +02:00
David Gibson
d9599c9205 pseries: Clean up hcall_dprintf() debugging messages
The pseries machine code has a number of debug messages for debugging PAPR
hypercalls, dependent on DEBUG_SPAPR_HCALLS.  This patch cleans these
messages up a bit, by adding __func__ to the hcall_dprintf() macro and
simplifying up a number of the individual messages accordingly.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2012-04-15 17:07:19 +02:00
David Gibson
d07fee7e8a pseries: Add support for level interrupts to XICS
The pseries "xics" interrupt controller, like most interrupt
controllers can support both message (i.e. edge sensitive) interrupts
and level sensitive interrupts, but it needs to know which are which.

When I implemented the xics emulation for qemu, the only devices we
supported were the PAPR virtual IO devices.  These devices only use
message interrupts, so they were the only ones I implemented in xics.

Since then, however, we have added support for PCI devices, which use
level sensitive interrupts.  It turns out the message interrupt logic
still actually works most of the time for these, but there are
circumstances where we can lost interrupts due to the incorrect
interrupt logic.

This patch, therefore, implements the correct xics level-sensitive
interrupt logic.  The type of the interrupt is set when a device
allocates a new xics interrupt.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-03-15 13:12:12 +01:00
Andreas Färber
e2684c0b58 ppc hw/: Don't use CPUState
Scripted conversion:
  for file in hw/ppc*.[hc] hw/mpc8544_guts.c hw/spapr*.[hc] hw/virtex_ml507.c hw/xics.c; do
    sed -i "s/CPUState/CPUPPCState/g" $file
  done

Signed-off-by: Andreas Färber <afaerber@suse.de>
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
2012-03-14 22:20:26 +01:00
Bharata B Rao
6e806cc38b pseries: FDT NUMA extensions to support multi-node guests
Add NUMA specific properties to guest's device tree to boot a multi-node
guests. This patch adds the following properties:

ibm,associativity
ibm,architecture-vec-5
ibm,associativity-reference-points

With this, it becomes possible to use -numa option on pseries targets.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2012-01-03 15:49:11 +01:00
Dong Xu Wang
66a0a2cb81 fix spelling in hw sub directory
Correct obvious spelling errors in qemu/hw directory.

Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-12-06 09:56:41 +00:00
David Gibson
3384f95c59 pseries: Add partial support for PCI
This patch adds a PCI bus to the pseries machine.  This instantiates
the qemu generic PCI bus code, advertises a PCI host bridge in the
guest's device tree and implements the RTAS methods specified by PAPR
to access PCI config space.  It also sets up the memory regions we
need to provide windows into the PCI memory and IO space, and
advertises those to the guest.

However, because qemu can't yet emulate an IOMMU, which is mandatory on
pseries, PCI devices which use DMA (i.e. most of them) will not work with
this code alone.  Still, this is enough to support the virtio_pci device
(which probably _should_ use emulated PCI DMA, but is specced to use
direct hypervisor access to guest physical memory instead).

[agraf] remove typedef which could cause compile errors

Signed-off-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-10-31 04:53:01 +01:00
Breno Leitao
ac26f8c389 pseries: Implement set-time-of-day RTAS function
Currently there is no implementation for set-time-of-day rtas function,
which causes the following warning "setting the clock failed (-1)" on
the guest.

This patch just creates this function, get the timedate diff and store in
the papr environment, so that the correct value will be returned by
get-time-of-day.

In order to try it, just adjust the hardware time, run hwclock --systohc,
so that, on when the system runs hwclock --hctosys, the value is correctly
adjusted, i.e. the host time plus the timediff.

Signed-off-by: Breno Leitao <brenohl@br.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-10-06 09:48:09 +02:00
David Gibson
e6c866d417 pseries: Refactor spapr irq allocation
Paulo Bonzini changed the original spapr code, which manually assigned irq
numbers for each virtual device, to allocate them automatically from the
device initialization. That allowed spapr virtual devices to be constructed
with -device, which is a good start.  However, the way that patch worked
doesn't extend nicely for the future when we want to support devices other
than sPAPR VIO devices (e.g. virtio and PCI).

This patch rearranges the irq allocation to be global across the sPAPR
environment, so it can be used by other bus types as well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-10-06 09:48:09 +02:00
David Gibson
f73a2575a3 pseries: More complete WIMG validation in H_ENTER code
Currently our implementation of the H_ENTER hypercall, which inserts a
mapping in the hash page table assumes that only ordinary memory is ever
mapped, and only permits mapping attribute bits accordingly (WIMG==0010).

However, we intend to start adding emulated IO to the pseries platform
(and real IO with PCI passthrough on kvm) which means this simple test
will no longer suffice.

This patch extends the h_enter validation code to check if the given
address is a RAM address.  If it is it enforces WIMG==0010, otherwise
it assumes that it is an IO mapping and instead enforces WIMG=010x.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-10-06 09:48:03 +02:00
Paolo Bonzini
277f9acf79 spapr: proper qdevification
Right now the spapr devices cannot be instantiated with -device,
because the IRQs need to be passed to the spapr_*_create functions.
Do this instead in the bus's init wrapper.

This is particularly important with the conversion from scsi-disk
to scsi-{cd,hd} that Markus made.  After his patches, if you
specify a scsi-cd device attached to an if=none drive, the default
VSCSI controller will not be created and, without qdevification,
you will not be able to add yours.

NOTE from agraf: added small compile fix

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-10-06 09:43:32 +02:00
Alexander Graf
06c46bbab0 spapr: use specific endian ld/st_phys
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2011-07-12 20:00:33 +00:00
David Gibson
a3467baa88 Delay creation of pseries device tree until reset
At present, the 'pseries' machine creates a flattened device tree in the
machine->init function to pass to either the guest kernel or to firmware.

However, the machine->init function runs before processing of -device
command line options, which means that the device tree so created will
be (incorrectly) missing devices specified that way.

Supplying a correct device tree is, in any case, part of the required
platform entry conditions.  Therefore, this patch moves the creation and
loading of the device tree from machine->init to a reset callback.  The
setup of entry point address and initial register state moves with it,
which leads to a slight cleanup.

This is not, alas, quite enough to make a fully working reset for pseries.
For that we would need to reload the firmware images, which on this
machine are loaded into RAM.  It's a step in the right direction, though.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-04-08 11:32:21 +02:00
David Gibson
b5cec4c5f2 Implement the PAPR (pSeries) virtualized interrupt controller (xics)
PAPR defines an interrupt control architecture which is logically divided
into ICS (Interrupt Control Presentation, each unit is responsible for
presenting interrupts to a particular "interrupt server", i.e. CPU) and
ICS (Interrupt Control Source, each unit responsible for one or more
hardware interrupts as numbered globally across the system).  All PAPR
virtual IO devices expect to deliver interrupts via this mechanism.  In
Linux, this interrupt controller system is handled by the "xics" driver.

On pSeries systems, access to the interrupt controller is virtualized via
hypercalls and RTAS methods.  However, the virtualized interface is very
similar to the underlying interrupt controller hardware, and similar PICs
exist un-virtualized in some other systems.

This patch implements both the ICP and ICS sides of the PAPR interrupt
controller.  For now, only the hypercall virtualized interface is provided,
however it would be relatively straightforward to graft an emulated
register interface onto the underlying interrupt logic if we want to add
a machine with a hardware ICS/ICP system in the future.

There are some limitations in this implementation: it is assumed for now
that only one instance of the ICS exists, although a full xics system can
have several, each responsible for a different group of hardware irqs.
ICP/ICS can handle both level-sensitve (LSI) and message signalled (MSI)
interrupt inputs.  For now, this implementation supports only MSI
interrupts, since that is used by PAPR virtual IO devices.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-04-01 18:34:56 +02:00
David Gibson
39ac845510 Implement hcall based RTAS for pSeries machines
On pSeries machines, operating systems can instantiate "RTAS" (Run-Time
Abstraction Services), a runtime component of the firmware which implements
a number of low-level, infrequently used operations.  On logical partitions
under a hypervisor, many of the RTAS functions require hypervisor
privilege.  For simplicity, therefore, hypervisor systems typically
implement the in-partition RTAS as just a tiny wrapper around a hypercall
which actually implements the various RTAS functions.

This patch implements such a hypercall based RTAS for our emulated pSeries
machine.  A tiny in-partition "firmware" calls a new hypercall, which
looks up available RTAS services in a table.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-04-01 18:34:56 +02:00
David Gibson
4040ab7237 Implement the bus structure for PAPR virtual IO
This extends the "pseries" (PAPR) machine to include a virtual IO bus
supporting the PAPR defined hypercall based virtual IO mechanisms.

So far only one VIO device is provided, the vty / vterm, providing
a full console (polled only, for now).

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-04-01 18:34:55 +02:00
David Gibson
9fdf0c2995 Start implementing pSeries logical partition machine
This patch adds a "pseries" machine to qemu.  This aims to emulate a
logical partition on an IBM pSeries machine, compliant to the
"PowerPC Architecture Platform Requirements" (PAPR) document.

This initial version is quite limited, it implements a basic machine
and PAPR hypercall emulation.  So far only one hypercall is present -
H_PUT_TERM_CHAR - so that a (write-only) console is available.

Multiple CPUs are permitted, with SMP entry handled kexec() style.

The machine so far more resembles an old POWER4 style "full system
partition" rather than a modern LPAR, in that the guest manages the
page tables directly, rather than via hypercalls.

The machine requires qemu to be configured with --enable-fdt.  The
machine can (so far) only be booted with -kernel - i.e. no partition
firmware is provided.

Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2011-04-01 18:34:55 +02:00