Commit Graph

1006 Commits

Author SHA1 Message Date
Cédric Le Goater 967b75230b ppc/pnv: add XSCOM infrastructure
On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves
as a backbone to connect different units of the system. The host
firmware connects to the PIB through a bridge unit, the
Alter-Display-Unit (ADU), which gives him access to all the chiplets
on the PCB network (Pervasive Connect Bus), the PIB acting as the root
of this network.

XSCOM (serial communication) is the interface to the sideband bus
provided by the POWER8 pervasive unit to read and write to chiplets
resources. This is needed by the host firmware, OPAL and to a lesser
extent, Linux. This is among others how the PCI Host bridges get
configured at boot or how the LPC bus is accessed.

To represent the ADU of a real system, we introduce a specific
AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The
translation of an XSCOM address into a PCB register address is
slightly different between the P9 and the P8. This is handled before
the dispatch using a 8byte alignment for all.

To customize the device tree, a QOM InterfaceClass, PnvXScomInterface,
is provided with a populate() handler. The chip populates the device
tree by simply looping on its children. Therefore, each model needing
custom nodes should not forget to declare itself as a child at
instantiation time.

Based on previous work done by :
      Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Added cpu parameter to xscom_complete()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater d2fd9612ee ppc/pnv: add a PnvCore object
This is largy inspired by sPAPRCPUCore with some simplification, no
hotplug for instance. A set of PnvCore objects is added to the PnvChip
and the device tree is populated looping on these cores.

Real HW cpu ids are now generated depending on the chip cpu model, the
chip id and a core mask. The id is propagated to the CPU object, using
properties, to set the SPR_PIR (Processor Identification Register)

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 631adaff31 ppc/pnv: add a PIR handler to PnvChip
The Processor Identification Register (PIR) is a register that holds a
processor identifier which is used for bus transactions (XSCOM) and
for processor differentiation in multiprocessor systems. It also used
in the interrupt vector entries (IVE) to identify the thread serving
the interrupts.

P9 and P8 have some differences in the CPU PIR encoding.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater 397a79e757 ppc/pnv: add a core mask to PnvChip
This will be used to build real HW ids for the cores and enforce some
limits on the available cores per chip.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Cédric Le Goater e997040e3f ppc/pnv: add a PnvChip object
This is is an abstraction of a POWER8 chip which is a set of cores
plus other 'units', like the pervasive unit, the interrupt controller,
the memory controller, the on-chip microcontroller, etc. The whole can
be seen as a socket. It depends on a cpu model and its characteristics:
max cores and specific inits are defined in a PnvChipClass.

We start with an near empty PnvChip with only a few cpu constants
which we will grow in the subsequent patches with the controllers
required to run the system.

The Chip CFAM (Common FRU Access Module) ID gives the model of the
chip and its version number. It is generally the first thing firmwares
fetch, available at XSCOM PCB address 0xf000f, to start initialization.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:25 +11:00
Benjamin Herrenschmidt 9e933f4a62 ppc/pnv: add skeleton PowerNV platform
The goal is to emulate a PowerNV system at the level of the skiboot
firmware, which loads the OS and provides some runtime services. Power
Systems have a lower firmware (HostBoot) that does low level system
initialization, like DRAM training. This is beyond the scope of what
qemu will address in a PowerNV guest.

No devices yet, not even an interrupt controller. Just to get started,
some RAM to load the skiboot firmware, the kernel and initrd. The
device tree is fully created in the machine reset op.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
      - replaced fprintf by error_report
      - used a common definition of _FDT macro
      - removed VMStateDescription as migration is not yet supported
      - added IBM Copyright statements
      - reworked kernel_filename handling
      - merged PnvSystem and sPowerNVMachineState
      - removed PHANDLE_XICP
      - added ppc_create_page_sizes_prop helper
      - removed nmi support
      - removed kvm support
      - updated powernv machine to version 2.8
      - removed chips and cpus, They will be provided in another patches
      - added a machine reset routine to initialize the device tree (also)
      - french has a squelette and english a skeleton.
      - improved commit log.
      - reworked prototypes parameters
      - added a check on the ram size (thanks to Michael Ellerman)
      - fixed chip-id cell
      - changed MAX_CPUS to 2048
      - simplified memory node creation to one node only
      - removed machine version
      - rewrote the device tree creation with the fdt "rw" routines
      - s/sPowerNVMachineState/PnvMachineState/
      - etc.]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:38:24 +11:00
Michael Roth 4bcfa56ca9 spapr_pci: advertise explicit numa IDs even when there's 1 node
With the addition of "numa_node" properties for PHBs we began
advertising NUMA affinity in cases where nb_numa_nodes > 1.

Since the default on the guest side is to make no assumptions about
PHB NUMA affinity (defaulting to -1), there is still a valid use-case
for explicitly defining a PHB's NUMA affinity even when there's just
one node. In particular, some workloads make faulty assumptions about
/sys/bus/pci/<devid>/numa_node being >= 0, warranting the use of
this property as a workaround even if there's just 1 PHB or NUMA
node.

Enable this use-case by always advertising the PHB's NUMA affinity
if "numa_node" has been explicitly set.

We could achieve this by relaxing the check to simply be
nb_numa_nodes > 0, but even safer would be to check
numa_info[nodeid].present explicitly, and to fail at start time
for cases where it does not exist.

This has an additional affect of no longer advertising PHB NUMA
affinity unconditionally if nb_numa_nodes > 1 and "numa_node"
property is unset/-1, but since the default value on the guest
side for each PHB is also -1, the behavior should be the same for
that situation. We could still retain the old behavior if desired,
but the decision seems arbitrary, so we take the simpler route.

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Shivaprasad G. Bhat <shivapbh@in.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-28 09:36:58 +11:00
Igor Mammedov 079019f2e3 Increase MAX_CPUMASK_BITS from 255 to 288
so that it would be possible to increase maxcpus limit
for x86 target. Keep spapr/virt_arm at limit they used
to have 255.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-10-24 17:29:15 -02:00
David Gibson 357d1e3bc7 spapr: Improved placement of PCI host bridges in guest memory map
Currently, the MMIO space for accessing PCI on pseries guests begins at
1 TiB in guest address space.  Each PCI host bridge (PHB) has a 64 GiB
chunk of address space in which it places its outbound PIO and 32-bit and
64-bit MMIO windows.

This scheme as several problems:
  - It limits guest RAM to 1 TiB (though we have a limited fix for this
    now)
  - It limits the total MMIO window to 64 GiB.  This is not always enough
    for some of the large nVidia GPGPU cards
  - Putting all the windows into a single 64 GiB area means that naturally
    aligning things within there will waste more address space.
In addition there was a miscalculation in some of the defaults, which meant
that the MMIO windows for each PHB actually slightly overran the 64 GiB
region for that PHB.  We got away without nasty consequences because
the overrun fit within an unused area at the beginning of the next PHB's
region, but it's not pretty.

This patch implements a new scheme which addresses those problems, and is
also closer to what bare metal hardware and pHyp guests generally use.

Because some guest versions (including most current distro kernels) can't
access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
64 TiB.  This is broken into 1 TiB chunks.  The first 1 TiB contains the
PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs.  Each
subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
one PHB each.

This reduces the number of allowed PHBs (without full manual configuration
of all the windows) from 256 to 31, but this should still be plenty in
practice.

We also change some of the default window sizes for manually configured
PHBs to saner values.

Finally we adjust some tests and libqos so that it correctly uses the new
default locations.  Ideally it would parse the device tree given to the
guest, but that's a more complex problem for another time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:04:15 +11:00
David Gibson daa2369903 spapr_pci: Add a 64-bit MMIO window
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
  - A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
  - A 64-bit window which maps onto a large region somewhere high in PCI
    address space (traditionally this used an identity mapping from guest
    physical address to PCI address, but that's not always the case)

The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window, however.  At least some Linux versions expect
the two windows however, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.

This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window.  With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.

This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured.  The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).

So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified.  This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.

For now, this only adds the possibility of 64-bit windows.  The default
configuration still uses the legacy mode.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
David Gibson 2efff1c0dd spapr: Adjust placement of PCI host bridge to allow > 1TiB RAM
Currently the default PCI host bridge for the 'pseries' machine type is
constructed with its IO windows in the 1TiB..(1TiB + 64GiB) range in
guest memory space.  This means that if > 1TiB of guest RAM is specified,
the RAM will collide with the PCI IO windows, causing serious problems.

Problems won't be obvious until guest RAM goes a bit beyond 1TiB, because
there's a little unused space at the bottom of the area reserved for PCI,
but essentially this means that > 1TiB of RAM has never worked with the
pseries machine type.

This patch fixes this by altering the placement of PHBs on large-RAM VMs.
Instead of always placing the first PHB at 1TiB, it is placed at the next
1 TiB boundary after the maximum RAM address.

Technically, this changes behaviour in a migration-breaking way for
existing machines with > 1TiB maximum memory, but since having > 1 TiB
memory was broken anyway, this seems like a reasonable trade-off.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
David Gibson 6737d9ad79 spapr_pci: Delegate placement of PCI host bridges to machine type
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
for a PAPR guest.  Unlike on x86, it's routine on Power (both bare metal
and PAPR guests) to have numerous independent PHBs, each controlling a
separate PCI domain.

There are two ways of configuring the spapr-pci-host-bridge device: first
it can be done fully manually, specifying the locations and sizes of all
the IO windows.  This gives the most control, but is very awkward with 6
mandatory parameters.  Alternatively just an "index" can be specified
which essentially selects from an array of predefined PHB locations.
The PHB at index 0 is automatically created as the default PHB.

The current set of default locations causes some problems for guests with
large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
GPGPU cards via VFIO).  Obviously, for migration we can only change the
locations on a new machine type, however.

This is awkward, because the placement is currently decided within the
spapr-pci-host-bridge code, so it breaks abstraction to look inside the
machine type version.

So, this patch delegates the "default mode" PHB placement from the
spapr-pci-host-bridge device back to the machine type via a public method
in sPAPRMachineClass.  It's still a bit ugly, but it's about the best we
can do.

For now, this just changes where the calculation is done.  It doesn't
change the actual location of the host bridges, or any other behaviour.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-10-16 12:03:09 +11:00
Benjamin Herrenschmidt cc706a5305 ppc/xics: Make the ICSState a list
Instead of an array of fixed sized blocks, use a list, as we will need
to have sources with variable number of interrupts. SPAPR only uses
a single entry. Native will create more. If performance becomes an
issue we can add some hashed lookup but for now this will do fine.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[ move the initialization of list to xics_common_initfn,
  restore xirr_owner after migration and move restoring to
  icp_post_load]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[ clg: removed the icp_post_load() changes from nikunj patchset v3:
       http://patchwork.ozlabs.org/patch/646008/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-14 16:31:02 +11:00
Michael Roth 672de881e9 spapr: fix inheritance chain for default machine options
Rather than machine instances having backward-compatible option
defaults that need to be repeatedly re-enabled for every new machine
type we introduce, we set the defaults appropriate for newer machine
types, then add code to explicitly disable instance options as needed
to maintain compatibility with older machine types.

Currently pseries-2.5 does not inherit from pseries-2.6 in this
fashion, which is okay at the moment since we do not have any
instance compatibility options for pseries-2.6+ currently.

We will make use of this in future patches though, so fix it here.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Extended to make 2.7 inherit from 2.8 as well]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-14 15:33:32 +11:00
Igor Mammedov 6bea1ddf8b numa: reduce code duplication by adding helper numa_get_node_for_cpu()
Replace repeated pattern

    for (i = 0; i < nb_numa_nodes; i++) {
        if (test_bit(idx, numa_info[i].node_cpu)) {
           ...
           break;

with a helper function to lookup numa node index for cpu.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-10-10 01:16:57 +03:00
Thomas Huth 3daa4a9f95 hw/ppc/spapr: Use POWER8 by default for the pseries-2.8 machine
A couple of distributors are compiling their distributions
with "-mcpu=power8" for ppc64le these days, so the user sooner
or later runs into a crash there when not explicitely specifying
the "-cpu POWER8" option to QEMU (which is currently using POWER7
for the "pseries" machine by default). Due to this reason, the
linux-user target already switched to POWER8 a while ago (see commit
de3f1b9841). Since the softmmu target
of course has the same problem, we should switch there to POWER8 for
the newer machine types, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-06 16:15:53 +11:00
Greg Kurz e17a87792d spapr: fix check of cpu alias name in spapr_get_cpu_core_type()
If the user passes an alias name and a property to -cpu, QEMU fails to
find the CPU definition and exits.

$ qemu-system-ppc64 -cpu POWER8E,compat=power7
qemu-system-ppc64: Unable to find sPAPR CPU Core definition

This happens because spapr_get_cpu_core_type() passes the full string from
the command line (i.e. "POWER8E,compat=power7") to ppc_cpu_lookup_alias(),
instead of the alias name piece only (i.e. "POWER8E").

The fix is to pass model_pieces[0] to ppc_cpu_lookup_alias().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-06 16:15:53 +11:00
Thomas Huth bac3bf287a ppc: Check the availability of transactional memory
KVM-PR currently does not support transactional memory, and the
implementation in TCG is just a fake. We should not announce TM
support in the ibm,pa-features property when running on such a
system, so disable it by default and only enable it if the KVM
implementation supports it (i.e. recent versions of KVM-HV).
These changes are based on some earlier work from Anton Blanchard
(thanks!).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-05 11:05:28 +11:00
Thomas Huth 4cbec30d76 hw/ppc/spapr: Fix the selection of the processor features
The current code uses pa_features_206 for POWERPC_MMU_2_06, and
for everything else, it uses pa_features_207. This is bad in some
cases because there is also a "degraded" MMU version of ISA 2.06,
called POWERPC_MMU_2_06a, which should of course use the flags for
2.06 instead. And there is also the possibility that the user runs
the pseries machine with a POWER5+ or even 970 processor. In that
case we certainly do not want to set the flags for 2.07, and rather
simply skip the setting of the pa-features property instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-05 11:05:28 +11:00
Thomas Huth 230bf719d3 hw/ppc/spapr: Move code related to "ibm,pa-features" to a separate function
The function spapr_populate_cpu_dt() has become quite big
already, and since we likely have to extend the pa-features
property for every new processor generation, it is nicer
if we put the related code into a separate function.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-05 11:05:28 +11:00
David Gibson db800b21d8 pseries: Add 2.8 machine type, set up compatibility macros
Now that 2.7 is released, create the pseries-2.8 machine type and add the
boilerplate compatiblity macro stuff.  There's nothing new to put into the
2.7 compatiliby properties yet, but we'll need something eventually, so
we might as well get it ready now.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-10-05 11:05:28 +11:00
Peter Maydell c640f2849e * thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-)
* license clarification for compiler.h (Felipe)
 * glib cflags improvement (Marc-André)
 * checkpatch silencing (Paolo)
 * SMRAM migration fix (Paolo)
 * Replay improvements (Pavel)
 * IOMMU notifier improvements (Peter)
 * IOAPIC now defaults to version 0x20 (Peter)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQExBAABCAAbBQJX6kKUFBxwYm9uemluaUByZWRoYXQuY29tAAoJEL/70l94x66D
 M1UIAKCQ7XfWDoClYd1TyGZ+Qj3K3TrjwLDIl/Z258euyeZ9p7PpqYQ64OCRsREJ
 fsGQOqkFYDe7gi4epJiJOuu4oAW7Xu8G6lB2RfBd7KWVMhsl3Che9AEom7amzyzh
 yoN+g9gwKfAmYwpKyjYWnlWOSjUvif6o0DaTCQCMTaAoEM3b4HKdgHfr6A2dA/E/
 47rtIVp/jNExmrZkaOjnCDS1DJ8XYT3aVeoTkuzRFQ3DBzrAiPABn6B4ExP8IBcJ
 YLFX/W8xG7F3qyXbKQOV/uYM25A55WS5B0G94ZfSlDtUGa/avzS7df9DFD/IWQT+
 RpfiyDdeJueByiTw9R0ZYxFjhd8=
 =g7xm
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* thread-safe tb_flush (Fred, Alex, Sergey, me, Richard, Emilio,... :-)
* license clarification for compiler.h (Felipe)
* glib cflags improvement (Marc-André)
* checkpatch silencing (Paolo)
* SMRAM migration fix (Paolo)
* Replay improvements (Pavel)
* IOMMU notifier improvements (Peter)
* IOAPIC now defaults to version 0x20 (Peter)

# gpg: Signature made Tue 27 Sep 2016 10:57:40 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (28 commits)
  replay: allow replay stopping and restarting
  replay: vmstate for replay module
  replay: move internal data to the structure
  cpus-common: lock-free fast path for cpu_exec_start/end
  tcg: Make tb_flush() thread safe
  cpus-common: Introduce async_safe_run_on_cpu()
  cpus-common: simplify locking for start_exclusive/end_exclusive
  cpus-common: remove redundant call to exclusive_idle()
  cpus-common: always defer async_run_on_cpu work items
  docs: include formal model for TCG exclusive sections
  cpus-common: move exclusive work infrastructure from linux-user
  cpus-common: fix uninitialized variable use in run_on_cpu
  cpus-common: move CPU work item management to common code
  cpus-common: move CPU list management to common code
  linux-user: Add qemu_cpu_is_self() and qemu_cpu_kick()
  linux-user: Use QemuMutex and QemuCond
  cpus: Rename flush_queued_work()
  cpus: Move common code out of {async_, }run_on_cpu()
  cpus: pass CPUState to run_on_cpu helpers
  build-sys: put glib_cflags in QEMU_CFLAGS
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-28 23:02:56 +01:00
David Gibson 4f01a63779 sysbus: Remove ignored return value of FindSysbusDeviceFunc
Functions of type FindSysbusDeviceFunc currently return an integer.
However, this return value is always ignored by the caller in
find_sysbus_device().

This changes the function type to return void, to avoid confusion over
the function semantics.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-09-27 17:03:34 -03:00
Alex Bennée e0eeb4a21a cpus: pass CPUState to run_on_cpu helpers
CPUState is a fairly common pointer to pass to these helpers. This means
if you need other arguments for the async_run_on_cpu case you end up
having to do a g_malloc to stuff additional data into the routine. For
the current users this isn't a massive deal but for MTTCG this gets
cumbersome when the only other parameter is often an address.

This adds the typedef run_on_cpu_func for helper functions which has an
explicit CPUState * passed as the first parameter. All the users of
run_on_cpu and async_run_on_cpu have had their helpers updated to use
CPUState where available.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[Sergey Fedorov:
 - eliminate more CPUState in user data;
 - remove unnecessary user data passing;
 - fix target-s390x/kvm.c and target-s390x/misc_helper.c]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> (s390 parts)
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1470158864-17651-3-git-send-email-alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 11:57:29 +02:00
Peter Xu 5bf3d31903 memory: introduce IOMMUOps.notify_flag_changed
The new interface can be used to replace the old notify_started() and
notify_stopped(). Meanwhile it provides explicit flags so that IOMMUs
can know what kind of notifications it is requested for.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1474606948-14391-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-09-27 09:00:04 +02:00
Peter Maydell c229472af0 ppc patch queue 2016-09-23
This pull request supersedes ppc-for-2.8-20160922.  There was a clang
 build error in that, and I've also added one extra patch in the new pull.
 
 Included in this set of ppc and spapr patches are:
     * TCG implementations for more POWER9 instructions
     * Some preliminary XICS fixes in preparataion for the pnv machine type
     * A significant ADB (Macintosh kbd/mouse) cleanup
     * Some conversions to use trace instead of debug macros
     * Fixes to correctly handle global TLB flush synchronization in
       TCG.  This is already a bug, but it will have much more impact
       when we get MTTCG
     * Add more qtest testcases for Power
     * Some MAINTAINERS updates
     * Assorted bugfixes
     * Add the basics of NUMA associativity to the spapr PCI host bridge
 
 This touches some test files and monitor.c which are technically
 outside the ppc code, but coming through this tree because the changes
 are primarily of interest to ppc.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJX5NZnAAoJEGw4ysog2bOSoLEP/1YpRFG/6gmiT+T+Btz1QYcd
 eqrJkV63/rY/lvgZOvUBdqA/YKaBSWDOEByNFRZ+Grqz9h5zKrRcmM7IWdRWg+vG
 gyrZUm1pscFG20iGNcenxB8mD0VMk7C77gnUlv12bo+mK+1D1i8eUfKLFqxb0kOx
 JGIRQNG5orF5vZxsyjRPVpvMS9gNG90vrPIypux4ryozCVMWbrjXRZNsPQKz8wb9
 UGcJIFB6R6JVbmBGchi434PEJkcdZzP/a0HvVSO51oGsFBnwYwQ7XVc3PyA4KCD7
 tTbm6T2Rpdak3Pcd/nuzoXCMBCkh48XGKxZ+yPuLXGG5ZGIZ6rzlHPqBsEqqiLz5
 DLzbsxKyLHX2Af87js4J9OXkoNQI4rVGurvNbkQ7IMQ2/Xt97kgUEgr3W0Vj+r82
 bqIqWm4OdJ9cDzTGVlQ7l2vLv6RMe7DrkeWRNEKZZgfir7Hgj1gr79BOe96ETKBd
 7r/1z0fBkZoWSq2OdjX8RouXMwd1Nq3FnqYv2BQ99rvM/AqpkY0HYsPIfUilHq6T
 ZXhvm/4LIEev0F/GiJvV5jHHg637QS4QqdyglF8ODC8vSMvOThhL9Gj7EMgJs7hj
 Ywt1B5y88//Zq4+IGVda98J5ynOZO1CArvzoYR5UMnWiq2K0Lxpq7wemE/finyIK
 0jWLqlmCmYRzsS+oQEg/
 =et1C
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.8-20160923' into staging

ppc patch queue 2016-09-23

This pull request supersedes ppc-for-2.8-20160922.  There was a clang
build error in that, and I've also added one extra patch in the new pull.

Included in this set of ppc and spapr patches are:
    * TCG implementations for more POWER9 instructions
    * Some preliminary XICS fixes in preparataion for the pnv machine type
    * A significant ADB (Macintosh kbd/mouse) cleanup
    * Some conversions to use trace instead of debug macros
    * Fixes to correctly handle global TLB flush synchronization in
      TCG.  This is already a bug, but it will have much more impact
      when we get MTTCG
    * Add more qtest testcases for Power
    * Some MAINTAINERS updates
    * Assorted bugfixes
    * Add the basics of NUMA associativity to the spapr PCI host bridge

This touches some test files and monitor.c which are technically
outside the ppc code, but coming through this tree because the changes
are primarily of interest to ppc.

# gpg: Signature made Fri 23 Sep 2016 08:14:47 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.8-20160923: (45 commits)
  spapr_pci: Add numa node id
  monitor: fix crash for platforms without a CPU 0
  linux-user: ppc64: fix ARCH_206 bit in AT_HWCAP
  ppc/kvm: Mark 64kB page size support as disabled if not available
  ppc/xics: An ICS with offset 0 is assumed to be uninitialized
  ppc/xics: account correct irq status
  Enable H_CLEAR_MOD and H_CLEAR_REF hypercalls on KVM/PPC64.
  target-ppc: tlbie/tlbivax should have global effect
  target-ppc: add flag in check_tlb_flush()
  target-ppc: add TLB_NEED_LOCAL_FLUSH flag
  spapr: Introduce sPAPRCPUCoreClass
  target-ppc: implement darn instruction
  target-ppc: add stxsi[bh]x instruction
  target-ppc: add lxsi[bw]zx instruction
  target-ppc: add xxspltib instruction
  target-ppc: consolidate store conditional
  target-ppc: move out stqcx impementation
  target-ppc: consolidate load with reservation
  target-ppc: convert st[16,32,64]r to use new macro
  target-ppc: convert st64 to use new macro
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-09-23 14:26:12 +01:00
Fam Zheng 9c5ce8db2e vl: Switch qemu_uuid to QemuUUID
Update all qemu_uuid users as well, especially get rid of the duplicated
low level g_strdup_printf, sscanf and snprintf calls with QEMU UUID API.

Since qemu_uuid_parse is quite tangled with qemu_uuid, its switching to
QemuUUID is done here too to keep everything in sync and avoid code
churn.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-10-git-send-email-famz@redhat.com>
2016-09-23 11:42:52 +08:00
Alexey Kardashevskiy 4814401fa0 spapr_pci: Add numa node id
This adds a numa id property to a PHB to allow linking passed PCI device
to CPU/memory. It is up to the management stack to do CPU/memory pinning
to the node with the actual PCI device.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[dwg: Renamed property from "node" to "numa_node" to match the similar
 one in the pxb device]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 12:39:07 +10:00
Nathan Whitehorn 5145ad4fad Enable H_CLEAR_MOD and H_CLEAR_REF hypercalls on KVM/PPC64.
These are mandatory per PAPR and available on Linux 4.3 and newer kernels. The calls in question are required to run FreeBSD guests with reasonable performance, so enable them if possible.

Signed-off-by: Nathan Whitehorn <nwhitehorn@freebsd.org>
[dwg: Added a stub to fix compile without KVM (e.g. on x86 host)]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 12:39:07 +10:00
Nikunj A Dadhania d76ab5e1c7 target-ppc: tlbie/tlbivax should have global effect
tlbie (BookS) and tlbivax (BookE) plus the H_CALLs(pseries) should have
a global effect.

Introduces TLB_NEED_GLOBAL_FLUSH flag. During lazy tlb flush, after
taking care of pending local flushes, check broadcast flush(at context
synchronizing event ptesync/tlbsync, etc) is needed. Depending on the
bitmask state of the tlb_need_flush, tlb is flushed from other cpus if
needed and the flags are cleared.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Use 'true' instead of '1' for call to check_tlb_flush()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 12:39:07 +10:00
Nikunj A Dadhania e3cffe6fad target-ppc: add flag in check_tlb_flush()
We flush the qemu TLB lazily. check_tlb_flush is called whenever we hit
a context synchronizing event or instruction that requires a pending
flush to be performed.

However, we fail to handle broadcast TLB flush operations. In order to
fix that efficiently, we want to differentiate whether check_tlb_flush()
needs to only apply pending local flushes (isync instructions,
interrupts, ...) or also global pending flush operations. The latter is
only needed when executing instructions that are defined architecturally
as synchronizing global TLB flush operations. This in our case is
ptesync on BookS and tlbsync on BookE along with the paravirtualized
hypervisor calls.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[dwg: Changed gen_check_tlb_flush() to also take a bool, and fixed
 some spelling errors in commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 12:39:07 +10:00
Bharata B Rao 7ebaf79556 spapr: Introduce sPAPRCPUCoreClass
Each spapr cpu core type defines an instance_init routine which just
populates the CPU class name. This can be done in the class_init
commonly for all core types which simplifies the registration.
This is inspired by how PowerNV core types are registered.

Certain types of spapr cpu cores ('host' and generic type based on host
CPU) are initialized in target-ppc/kvm.c. To convert these type
registrations to use class_init, we need to expose
spapr_cpu_core_class_init() outside of spapr_cpu_core.c.

Commit d11b268e17 added a generic sPAPR CPU core family
type to support cases like POWER8 CPU type on POWER8E host CPU.
Switching to class_init would fix such scenarios to use the right
CPU thread type instead of defaulting to host-powerpc64-cpu.

In an unrelated cleanup, fix a typo in .get_hotplug_handler routine.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 12:39:06 +10:00
Laurent Vivier 7ab6a501c6 spapr_vio: convert to trace framework instead of DPRINTF
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 10:29:40 +10:00
Laurent Vivier 028ec3cee3 spapr_rtas: convert to trace framework instead of DPRINTF
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 10:29:40 +10:00
Laurent Vivier 24ac7755d7 spapr_drc: convert to trace framework instead of DPRINTF
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 10:29:40 +10:00
Laurent Vivier eeddd59f59 tests: add RTAS command in the protocol
Add a first test to validate the protocol:

- rtas/get-time-of-day compares the time
  from the guest with the time from the host.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-23 10:29:40 +10:00
Ladi Prosek d4b84d564e Remove unused function declarations
Unused function declarations were found using a simple gcc plugin and
manually verified by grepping the sources.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-09-15 15:32:22 +03:00
Cédric Le Goater 3654fa95bc hw/ppc: add a ppc_create_page_sizes_prop() helper routine
The exact same routine will be used in PowerNV.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-07 12:40:12 +10:00
Cédric Le Goater ce9863b797 hw/ppc: use error_report instead of fprintf
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-07 12:40:12 +10:00
Cédric Le Goater 7804c353a9 hw/ppc: include fdt helper routine in a common file
spapr_pci would also be a good candidate but the macro _FDT is
slightly different. It returns and does not exit.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-09-07 09:52:14 +10:00
Peter Maydell f3b9e787ae ppc patch queue for 2016-08-15
Just a single patch here, I hope this is the last ppc / spapr fix to
 squeeze into qemu-2.7.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXsWVMAAoJEGw4ysog2bOSxpkQAKCybBBMbQ6viEeqZBNtrleC
 whKm6WhN5AZvxb1W/NzacrpwXPHCM8C9+jZRIpea3ucHn5ijyRPCE73gBZLcyV6h
 CRFisJQ2NT9gq4iCw0Iw1TwxL+tt6xw2dPr3+mKQpJuUHbcKK8hO5EhZLe/dr+u7
 54j2l+EgqhokTjLJuD7GEa/qca1qSsae/Q0HvIThcA4h4jX5RtpMHNSpbh6PJ8fI
 dxlcHnjtfei75ptMMqrP+YZ+HPEuiqOqLSVKmcEsjJblKABk7SW7RjbW4Jk8dKYo
 Z8VA+MOP+eLrbjYOPJHROHK80Ik6hg3NH/4/tduZM0hsOeFV2i9AyMR1n/Qhkpyu
 xEi8Ld+wcVun8NFWV2dj/m/RAE/BgZ1non3wddxVIog8W2R/+PMIfMdVOWt3pRMj
 KS/1kkCzKYHWFO18FTpxGfFLsdiNo1szjtJydjfAGd5RvectDm6bBguz0ZwgDPSo
 338I7uIFB7h4L/DwMFcPSYTRTSyrvE5MsxcwpQoS4OB5ZKrKGLrqLG9cy0XvO9sO
 ImHRMT/YMnD9qiXXnuzmHCg8XgRPyfbxdml6EkxcIDJn9wsINDRdvN9GZ33vDUgT
 CBy7xqxRlYJ+MXFJP5S6dyzM6mqtwy8MFDqlcDvIzNDl5GEAyVJHjQdtUu/t3cRx
 OzQ0bArG7WeIK2norvwL
 =Jm4E
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160815' into staging

ppc patch queue for 2016-08-15

Just a single patch here, I hope this is the last ppc / spapr fix to
squeeze into qemu-2.7.

# gpg: Signature made Mon 15 Aug 2016 07:46:36 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160815:
  ppc: parse cpu features once

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-08-15 21:48:03 +01:00
Greg Kurz e703d2f71c ppc: parse cpu features once
Considering that features are converted to global properties and
global properties are automatically applied to every new instance
of created CPU (at object_new() time), there is no point in
parsing cpu_model string every time a CPU created. So move
parsing outside CPU creation loop and do it only once.

Parsing also should be done before any CPU is created so that
features would affect the first CPU a well.

This patch does that for all PowerPC machine types.

It is based on previous work from Bharata:

https://lists.nongnu.org/archive/html/qemu-devel/2016-06/msg07564.html

Signed-off-by: Greg Kurz <groug@kaod.org>
[clg: only kept the fix for the spapr platform. support for other
      platform will be added in 2.8 ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-13 17:32:58 +10:00
Laurent Vivier e723b87103 trace-events: fix first line comment in trace-events
Documentation is docs/tracing.txt instead of docs/trace-events.txt.

find . -name trace-events -exec \
     sed -i "s?See docs/trace-events.txt for syntax documentation.?See docs/tracing.txt for syntax documentation.?" \
     {} \;

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 1470669081-17860-1-git-send-email-lvivier@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-08-12 10:36:01 +01:00
Thomas Huth 4babfaf05d hw/ppc/spapr: Look up CPU alias names instead of hard-coding the aliases
Hard-coding the CPU alias names in the spapr_cores[] array has
two big disadvantages:

1) We register a real type with the CPU alias name in
   spapr_cpu_core_register_types() - this prevents us from registering
   a CPU family name in kvm_ppc_register_host_cpu_type() with the same
   name (as we do it for the non-hotpluggable CPU types).

2) It's quite cumbersome to maintain the aliases here in sync with the
   ppc_cpu_aliases list from target-ppc/cpu-models.c.

So let's simply add proper alias lookup to the spapr cpu core code,
too (by checking whether the given model can be used directly, and
if not by trying to look up the given model as an alias name instead).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-10 13:12:20 +10:00
Cédric Le Goater caebf37859 spapr: remove extra type variable
The sPAPR CPU core typename is already available in the upper
block. Let's use it and move the check upward also.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-10 13:12:20 +10:00
Peter Maydell f5edfcfafb Error reporting patches for 2016-08-08
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXqDFpAAoJEDhwtADrkYZTlQMQALZDzjoYJQlmcLQu92O52a3L
 XlluF82W4Y6jOLR6u/eRsP4uok/C3FA23SMtw7CfPLJZbet/PDKLS4N7J0m4mrqa
 mGmBT/9ZY7jVeISJz4X7WW7chgFR0JF2rOUpEmQPvzrEYYY7cTd4DwHpb0UB1f7W
 /H3i55vkVUCpSeib8Ah/MgzYGdgv1ZVmh0X+IsEwd42J8f4nv8y3YSPO8J/DPooY
 hfHVikObX/LIx1yItFkKWzA2JW+nSLvBMXYtbvVUkVkDXwQYcHJcAKhYPzdiE6Iy
 GTSrnwXCW/4ckic/AumZ1WNTbcK5tp9FtdI/li4JzZHoJ/pWo0lt+BWCTmQOFCvs
 f0Vqza5Ux3B+hvCYM+ulmydnEGZVopc51u8cqEKGzYE2VrxJ0A63lqWCzm5F9gQj
 cE/546oiTa9pm4DDTfB064+Chzq1ao4AWga2yol7IWBvljkQZ7j+I620l5xv5Xaa
 WLhIDZg4e6EwViNtta73Fo3y8HqlvHTiPh3Gpfgvrnc7hocL7im3yh8O1RSOUCdY
 4aUmWonDg4zKPb2u9nkerWBCDM4s0p5rNTYmntJtoVIlsFvcUm/3yzVipdWyz5AX
 y9xLc3FqVfE2Kfw1qJHlw5fx7FegFJCfGzsa1xBZfL1qC9bfU1XGqj4fnyIbQ8pE
 WWrWL7bGuzSWZsQ2+JBT
 =FNBu
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-08-08' into staging

Error reporting patches for 2016-08-08

# gpg: Signature made Mon 08 Aug 2016 08:14:49 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2016-08-08:
  error: Fix error_printf() calls lacking newlines
  vfio: Use error_report() instead of error_printf() for errors
  checkpatch: Fix newline detection in error_setg() & friends
  error: Strip trailing '\n' from error string arguments (again)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-08-08 13:25:35 +01:00
Markus Armbruster df3c286c53 error: Strip trailing '\n' from error string arguments (again)
Commit 9af9e0f, 6daf194d, be62a2eb and 312fd5f got rid of a bunch, but
they keep coming back.  checkpatch.pl tries to flag them since commit
5d596c2, but it's not very good at it.  Offenders tracked down with
Coccinelle script scripts/coccinelle/err-bad-newline.cocci, an updated
version of the script from commit 312fd5f.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1470224274-31522-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-08-08 09:00:44 +02:00
David Gibson 57c0eb1e0d spapr: Fix undefined behaviour in spapr_tce_reset()
When a TCE table (sPAPR IOMMU context) is in disabled state (which is true
by default for the 64-bit window), it has tcet->nb_table == 0 and
tcet->table == NULL.  However, on system reset, spapr_tce_reset() executes,
which unconditionally calls
        memset(tcet->table, 0, table_size);

We get away with this in practice, because it's a zero length memset(),
but memset() on a NULL pointer is undefined behaviour, so we should not
call it in this case.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-08 10:06:25 +10:00
David Gibson 3c0c47e346 spapr: Correctly set query_hotpluggable_cpus hook based on machine version
Prior to c8721d3 "spapr: Error out when CPU hotplug is attempted on older
pseries machines", attempting to use query-hotpluggable-cpus on pseries-2.6
and earlier machine types would SEGV.

That change fixed that, but due to some unexpected interactions in init
order and a brown-paper-bag worthy failure to test, it accidentally
disabled query-hotpluggable-cpus for all pseries machine types, including
the current one which should allow it.

In fact, query_hotpluggable_cpus needs to be non-NULL when and only when
the dr_cpu_enabled flag in sPAPRMachineClass is set, which makes
dr_cpu_enabled itself redundant.

This patch removes dr_cpu_enabled, instead directly setting
query_hotpluggable_cpus from the machine class_init functions, and using
that to determine the availability of CPU hotplug when necessary.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-08 09:45:03 +10:00
Bharata B Rao c8721d3599 spapr: Error out when CPU hotplug is attempted on older pseries machines
CPU hotplug and coldplug aren't supported prior to pseries-2.7.  Further,
earlier machine types don't use CPU core objects at all.  These mean that
query-hotpluggable-cpus and coldplug on older pseries machines will crash
QEMU.  It also means that hotpluggable_cpus flag in query-machines will
be incorrectly set to true for pseries < 2.7, since it is based on the
presence of the query_hotpluggable_cpus hook.

- Don't assign the query_hotpluggable_cpus hook for pseries < 2.7
- query_hotpluggable_cpus should therefore never be called on pseries <
  2.7, so add an assert
- spapr_core_pre_plug() should fail hot/cold plug attempts for pseries <
  2.7, since core objects are never used there
- spapr_core_plug() should therefore never be called for pseries < 2.7, so
  add an assert.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Change from query_hotpluggable_cpus returning NULL for pseries < 2.7
 to not being called at all, reword commit message for accuracy]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-08-03 13:08:54 +10:00
Bharata B Rao 62be8b044a spapr: Prevent boot CPU core removal
Boot CPU is assumed to be always present in QEMU code. So
until that assumptions are gone, deny removal request.
In another words, QEMU won't support boot CPU core hot-unplug.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Tweaked error message for clarity]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-29 12:02:31 +10:00
David Gibson 7cdd76132a Revert "spapr: Ensure CPU cores are added contiguously and removed in LIFO order"
This reverts commit 5cbc64de25.

Now that we have stable cpu_index values for pseries-2.7 (and future)
machine types, we can now safely allow hotplug and unplug in any order.

Conflicts:
	hw/ppc/spapr_cpu_core.c

Some conflicts on revert due to some small changes in the inserted
code since the original commit.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-29 12:02:31 +10:00
Igor Mammedov b63578bdb5 spapr: init CPUState->cpu_index with index relative to core-id
It will enshure that cpu_index for a given cpu stays the same
regardless of the order cpus has been created/deleted and so
it would be possible to migrate QEMU instance with out of order
created CPU.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-29 12:02:31 +10:00
Greg Kurz 12bf2d33fe spapr: disintricate core-id from DT semantics
The goal of this patch is to have a stable core-id which does not depend
on any DT related semantics, which involve non-obvious computations on
modern PowerPC server cpus.

With this patch, the DT core id is computed on-demand as:

       (core-id / smp_threads) * smt

where smt is the number of threads per core in the host.

This formula should be consolidated in a helper since it is needed in
several places.

Other uses for core-id includes: compute a stable cpu_index (which
allows random order hotplug/unplug without breaking migration) and
NUMA.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-25 15:43:41 +10:00
Thomas Huth c573fc03da hw/ppc/spapr: Make sure to close the htab_fd when migration is canceled
When canceling a migration process, we currently do not close the
HTAB migration file descriptor since htab_save_complete() is never
called in that case. So we leave the migration process with a
dangling htab_fd value around, and this causes any further migration
attempts to fail. To fix this issue, simply make sure that the
htab_fd is closed during the migration cleanup stage. And since the
cleanup() function is also called when migration succeeds, we can
also remove the call to close_htab_fd() from the htab_save_complete()
function.

Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1354341
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-25 10:19:30 +10:00
Bharata B Rao 5cbc64de25 spapr: Ensure CPU cores are added contiguously and removed in LIFO order
If CPU core addition or removal is allowed in random order leading to
holes in the core id range (and hence in the cpu_index range), migration
can fail as migration with holes in cpu_index range isn't yet handled
correctly.

Prevent this situation by enforcing the addition in contiguous order
and removal in LIFO order so that we never end up with holes in
cpu_index range.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-18 10:40:27 +10:00
Greg Kurz 44d691f7d9 spapr: fix core unplug crash
If the host has 8 threads/core and the guest is started with:

-smp cores=1,threads=4,maxcpus=12

It is possible to crash QEMU by doing:

(qemu) device_add host-spapr-cpu-core,core-id=16,id=foo
(qemu) device_del foo
Segmentation fault

This happens because spapr_core_unplug() assumes cpu_dt_id == core_id.
As long as cpu_dt_id is derived from the non-table cpu_index, this is
only true when you plug cores with contiguous ids.

It is safer to be consistent: the DR connector was created with an
index that is immediately written to cc->core_id, and spapr_core_plug()
also relies on cc->core_id.

Let's use it also in spapr_core_unplug().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-18 10:40:27 +10:00
Markus Armbruster 2a6a4076e1 Clean up ill-advised or unusual header guards
Cleaned up with scripts/clean-header-guards.pl.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:20:46 +02:00
Markus Armbruster 121d07125b Clean up header guards that don't match their file name
Header guard symbols should match their file name to make guard
collisions less likely.  Offenders found with
scripts/clean-header-guards.pl -vn.

Cleaned up with scripts/clean-header-guards.pl, followed by some
renaming of new guard symbols picked by the script to better ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Markus Armbruster a9c94277f0 Use #include "..." for our own headers, <...> for others
Tracked down with an ugly, brittle and probably buggy Perl script.

Also move includes converted to <...> up so they get included before
ours where that's obviously okay.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-07-12 16:19:16 +02:00
Eric Blake 1158bb2a05 qapi: Add parameter to visit_end_*
Rather than making the dealloc visitor track of stack of pointers
remembered during visit_start_* in order to free them during
visit_end_*, it's a lot easier to just make all callers pass the
same pointer to visit_end_*.  The generated code has access to the
same pointer, while all other users are doing virtual walks and
can pass NULL.  The dealloc visitor is then greatly simplified.

All three visit_end_*() functions intentionally take a void**,
even though the visit_start_*() functions differ between void**,
GenericList**, and GenericAlternate**.  This is done for several
reasons: when doing a virtual walk, passing NULL doesn't care
what the type is, but when doing a generated walk, we already
have to cast the caller's specific FOO* to call visit_start,
while using void** lets us use visit_end without a cast. Also,
an upcoming patch will add a clone visitor that wants to use
the same implementation for all three visit_end callbacks,
which is made easier if all three share the same signature.

For visitors with already track per-object state (the QMP visitors
via a stack, and the string visitors which do not allow nesting),
add an assertion that the caller is indeed passing the same
pointer to paired calls.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465490926-28625-4-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-07-06 10:52:04 +02:00
Peter Maydell 791b7d2340 pc, pci, virtio: new features, cleanups, fixes
iommus can not be added with -device.
 cleanups and fixes all over the place
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJXe4l4AAoJECgfDbjSjVRpIz4IALye7mKG61/POA4Gqmhalc3d
 HnlNSZ2YcKAuvPg7WWkBuRacrQvVY/MbW1mLloG1lY0tdFgZG8Cy+CY6wJg1NE4c
 cXd+77vHkIyrnl+Nil+QOgTFiAsMnD+mXHHsnCDw2jGn3JbgVNuCMi7V34fGkQd2
 PDkZyYfwTqO3HytuG0/j2Somc9du1gjYdn+9qigfZVgP96jGDojBuJWuuU5flCB3
 Kj5xrOuI01XlbdTk71tVjBJBektQurWr6r7GECDqZIpUfc+BI70FU9jPh+OlLTO/
 92yi29ncjyStz4tRnf18xoQ8uBgH/tD1xigEUPRtnm1+0i/tgONBL8cAdBF9FBE=
 =ABGE
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc, pci, virtio: new features, cleanups, fixes

iommus can not be added with -device.
cleanups and fixes all over the place

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 05 Jul 2016 11:18:32 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (30 commits)
  vmw_pvscsi: remove unnecessary internal msi state flag
  e1000e: remove unnecessary internal msi state flag
  vmxnet3: remove unnecessary internal msi state flag
  mptsas: remove unnecessary internal msi state flag
  megasas: remove unnecessary megasas_use_msi()
  pci: Convert msi_init() to Error and fix callers to check it
  pci bridge dev: change msi property type
  megasas: change msi/msix property type
  mptsas: change msi property type
  intel-hda: change msi property type
  usb xhci: change msi/msix property type
  change pvscsi_init_msi() type to void
  tests: add APIC.cphp and DSDT.cphp blobs
  tests: acpi: add CPU hotplug testcase
  log: Permit -dfilter 0..0xffffffffffffffff
  range: Replace internal representation of Range
  range: Eliminate direct Range member access
  log: Clean up misuse of Range for -dfilter
  pci_register_bar: cleanup
  Revert "virtio-net: unbreak self announcement and guest offloads after migration"
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-05 16:48:24 +01:00
Benjamin Herrenschmidt 912acdf487 ppc/hash64: Add proper real mode translation support
This adds proper support for translating real mode addresses based
on the combination of HV and LPCR bits. This handles HRMOR offset
for hypervisor real mode, and both RMA and VRMA modes for guest
real mode. PAPR mode adjusts the offsets appropriately to match the
RMA used in TCG, but we need to limit to the max supported by the
implementation (16G).

This includes some fixes by Cédric Le Goater <clg@kaod.org>

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[dwg: Adjusted for differences in my version of the prereq patches]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:31:08 +10:00
Cédric Le Goater 1f0252e66e ppc: simplify ppc_hash64_hpte_page_shift_noslb()
The segment page shift parameter is never used. Let's remove it.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy ae4de14cd3 spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
This adds support for Dynamic DMA Windows (DDW) option defined by
the SPAPR specification which allows to have additional DMA window(s)

The "ddw" property is enabled by default on a PHB but for compatibility
the pseries-2.6 machine and older disable it.
This also creates a single DMA window for the older machines to
maintain backward migration.

This implements DDW for PHB with emulated and VFIO devices. The host
kernel support is required. The advertised IOMMU page sizes are 4K and
64K; 16M pages are supported but not advertised by default, in order to
enable them, the user has to specify "pgsz" property for PHB and
enable huge pages for RAM.

The existing linux guests try creating one additional huge DMA window
with 64K or 16MB pages and map the entire guest RAM to. If succeeded,
the guest switches to dma_direct_ops and never calls TCE hypercalls
(H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM
and not waste time on map/unmap later. This adds a "dma64_win_addr"
property which is a bus address for the 64bit window and by default
set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware
uses and this allows having emulated and VFIO devices on the same bus.

This adds 4 RTAS handlers:
* ibm,query-pe-dma-window
* ibm,create-pe-dma-window
* ibm,remove-pe-dma-window
* ibm,reset-pe-dma-window
These are registered from type_init() callback.

These RTAS handlers are implemented in a separate file to avoid polluting
spapr_iommu.c with PCI.

This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs
and updates all references to dma_liobn. However this does not add
64bit LIOBN to the migration stream as in fact even 32bit LIOBN is
rather pointless there (as it is a PHB property and the management
software can/should pass LIOBNs via CLI) but we keep it for the backward
migration support.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy 606b54986d spapr_iommu: Realloc guest visible TCE table when starting/stopping listening
The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU -
a guest view of the table and a hardware TCE table. If there is no VFIO
presense in the address space, then just the guest view is used, if
this is the case, it is allocated in the KVM. However since there is no
support yet for VFIO in KVM TCE hypercalls, when we start using VFIO,
we need to move the guest view from KVM to the userspace; and we need
to do this for every IOMMU on a bus with VFIO devices.

This implements the callbacks for the sPAPR IOMMU - notify_started()
reallocated the guest view to the user space, notify_stopped() does
the opposite.

This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug
path as the new callbacks do this better - they notify IOMMU at
the exact moment when the configuration is changed, and this also
includes the case of PCI hot unplug.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 10:43:02 +10:00
Bharata B Rao 7093645a84 spapr: Ensure thread0 of CPU core is always realized first
During CPU core realization, we create all the thread objects and parent
them to the core object in a loop. However, the realization of thread
objects is done separately by walking the threads of a core using
object_child_foreach(). With this, there is no guarantee on the order
in which the child thread objects get realized. Since CPU device tree
properties are currently derived from the CPU thread object, we assume
thread0 of the core to be the representative thread of the core when
creating device tree properties for the core. If thread0 is not the
first thread that gets realized, then we would end up having an
incorrect dt_id for the core and this causes hotplug failures from
the guest.

Fix this by realizing each thread object by walking the core's thread
object list thereby ensuring that thread0 and other threads are always
realized in the correct order.

Future TODO: CPU DT nodes are per-core properties and we should
ideally base the creation of CPU DT nodes on core objects rather than
the thread objects.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-05 10:43:02 +10:00
Marcel Apfelbaum 1b04cc801a hw/ppc: realize the PCI root bus as part of mac99 init
Mac99's PCI root bus is not part of a host bridge,
realize it manually.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-04 14:50:01 +03:00
Greg Kurz 8a1eb71bd8 spapr: drop duplicate variable in spapr_core_release()
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz f11235b920 spapr: do proper error propagation in spapr_cpu_core_realize_child()
This patch changes spapr_cpu_core_realize_child() to have a local error
pointer and use error_propagate() as it is supposed to be done.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz 8e758dee66 spapr: drop reference on child object during core realization
When a core is being realized, we create a child object for each thread
of the core.

The child is first initialized with object_initialize() which sets its ref
count to 1, and then added to the core with object_property_add_child()
which bumps the ref count to 2.

When the core gets released, object_unparent() decreases the ref count to 1,
and we g_free() the object: we hence loose the reference on an unfinalized
object. This is likely to cause random crashes.

Let's drop the extra reference as soon as we don't need it, after the
thread is added to the core.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Bharata B Rao 470f215787 spapr: Restore support for 970MP and POWER8NVL CPU cores
Introduction of core based CPU hotplug for PowerPC sPAPR didn't
add support for 970MP and POWER8NVL based core types. Add support for
the same.

While we are here, add support for explicit specification of POWER5+_v2.1
core type.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt 27f2458245 ppc/xics: Replace "icp" with "xics" in most places
The "ICP" is a different object than the "XICS". For historical reasons,
we have a number of places where we name a variable "icp" while it contains
a XICSState pointer. There *is* an ICPState structure too so this makes
the code really confusing.

This is a mechanical replacement of all those instances to use the name
"xics" instead. There should be no functional change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[spapr_cpu_init has been moved to spapr_cpu_core.c, change there]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt 161deaf225 ppc/xics: Rename existing xics to xics_spapr
The common class doesn't change, the KVM one is sPAPR specific. Rename
variables and functions to xics_spapr.

Retain the type name as "xics" to preserve migration for existing sPAPR
guests.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:46 +10:00
Aaron Larson a36848ff7c target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb
Eliminate redundant and incorrect booke206_page_size_to_tlb function
from ppce500_spin.c in preference to previously existing but newly
exported definition from e500.c

Defect analysis:

The booke206_page_size_to_tlb function in e500.c was updated in commit
2bd9543 "ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages" to
reflect a change in the definition of MAS1_TSIZE_SHIFT from 8
(corresponding to a min TLB page size of 4kb) to a value of 7 (TLB
page size 2k).  The booke206_page_size_to_tlb() function defined in
ppce500_spin.c was never updated to reflect the change in
MAS1_TSIZE_SHIFT.

In http://lists.nongnu.org/archive/html/qemu-ppc/2016-06/msg00533.html,
Scott Wood suggested this "root cause" explanation:

SW> The patch that changed MAS1_TSIZE_SHIFT from 8 to 7 was around the
SW> same time as the patch that added this code, which is probably why
SW> adjusting it got missed.  Commit 2bd9543cd3 did update the
SW> equivalent code in ppce500_mpc8544ds.c, which now resides in
SW> hw/ppc/e500.c and has been changed to not assume a power-of-2
SW> size.  The ppce500_spin version should be eliminated.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Bharata B Rao ff461b8da9 spapr: Restore support for older PowerPC CPU cores
Introduction of core based CPU hotplug for PowerPC sPAPR didn't
add support for 970 and POWER5+ based core types. Add support for
the same.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Greg Kurz dde35bc966 spapr: fix write-past-end-of-array error in cpu core device init code
This fixes a potential QEMU crash introduced by commit 3b54254966.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Thomas Huth 6cc09e261b hw/ppc/spapr: Add some missing hcall function set strings
Add "hcall-sprg0" (for H_SET_SPRG0), "hcall-copy" (for H_PAGE_INIT)
and "hcall-debug" (for H_LOGICAL_CI_LOAD/STORE) to the property
"ibm,hypertas-functions" to indicate that we support these hypercalls.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt 4b236b621b ppc: Initial HDEC support
The current behaviour isn't completely right, as for the DEC, we
don't properly re-arm when wrapping around, but I will fix this
in a separate patch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Peter Krempa 27393c33d8 qapi: keep names in 'CpuInstanceProperties' in sync with struct CPUCore
struct CPUCore uses 'id' suffix in the property name. As docs for
query-hotpluggable-cpus state that the cpu core properties should be
passed back to device_add by management in case new members are added
and thus the names for the fields should be kept in sync.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[dwg: Removed a duplicated word in comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27 13:15:06 +10:00
Aaron Larson 6d18a7a1ff target-ppc: ppce500_spin.c uses SPR_PIR, should use SPR_BOOKE_PIR
ppce500_spin.c uses SPR_PIR to initialize the spin table, however on
Book E processors the correct SPR is SPR_BOOKE_PIR.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27 13:12:22 +10:00
Alexey Kardashevskiy f682e9c244 memory: Add reporting of supported page sizes
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate
uses when translating, however this information is not available outside
the translate context for various checks.

This adds a get_min_page_size callback to MemoryRegionIOMMUOps and
a wrapper for it so IOMMU users (such as VFIO) can know the minimum
actual page size supported by an IOMMU.

As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE
as fallback.

This removes vfio_container_granularity() and uses new helper in
memory_region_iommu_replay() when replaying IOMMU mappings on added
IOMMU memory region.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: Removed an unnecessary calculation]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:13:09 +10:00
Aneesh Kumar K.V c117590769 powerpc/mm: Update the WIMG check during H_ENTER
Support for 0 value for memeory coherence is optional and with ppc64
we can always enable memory coherence. Linux kernel did that during
the development of 4.7 kernel. But that resulted in failure in Qemu
in H_ENTER hcall due to below check. The mentioned change was reverted
in the kernel and kernel right now enable memory coherence only if
cache inhibited is not set. Nevertheless update qemu WIMG flag check
to cover the case where we enable memory coherence along with cache
inhibited flag.

In order to handle older and newer kernel version consider both Cache
inhibitted and (cache inhibitted | memory conference) as valid values
for wimg flags.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:12:17 +10:00
Peter Maydell b0ad00b8c9 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJXaFInAAoJEJykq7OBq3PI6VsH/0Sfgbdo1RksYuQwb/y92sCW
 EN+lxUZ+OLfgrc8PYgNZwfSM3rsfYhznL0MAXOeEe7Ahabi07w7DhGR8WvwfAOlI
 G96FRuvrIPfv5u6U6fwS4CvG3TIHVLxfHKCsTpPUmH8U5CNx/x/tpjNiWN1dj6t+
 sXybSjYHfZfiZy2tI9MFIFWCdxnF/pl0QAPhbRqc8Y/RQTDrPKRjLpz+nitN/u96
 5TS7KlELyQuP91YMmLceYSmIkHbxW703h+iE2n4hov0uZCP8Jil+2Jsd3ziQSRlL
 j6LqexQ2ViBGdDSfiZGYES2VPlsHOCwb4G+IgWBStfZg1ppaXENvcDzPrgrB+L4=
 =eUnF
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging

# gpg: Signature made Mon 20 Jun 2016 21:29:27 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request: (42 commits)
  trace: split out trace events for linux-user/ directory
  trace: split out trace events for qom/ directory
  trace: split out trace events for target-ppc/ directory
  trace: split out trace events for target-s390x/ directory
  trace: split out trace events for target-sparc/ directory
  trace: split out trace events for net/ directory
  trace: split out trace events for audio/ directory
  trace: split out trace events for ui/ directory
  trace: split out trace events for hw/alpha/ directory
  trace: split out trace events for hw/arm/ directory
  trace: split out trace events for hw/acpi/ directory
  trace: split out trace events for hw/vfio/ directory
  trace: split out trace events for hw/s390x/ directory
  trace: split out trace events for hw/pci/ directory
  trace: split out trace events for hw/ppc/ directory
  trace: split out trace events for hw/9pfs/ directory
  trace: split out trace events for hw/i386/ directory
  trace: split out trace events for hw/isa/ directory
  trace: split out trace events for hw/sd/ directory
  trace: split out trace events for hw/sparc/ directory
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20 22:30:34 +01:00
Daniel P. Berrange 3054fba87b trace: split out trace events for hw/ppc/ directory
Move all trace-events for files in the hw/ppc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1466066426-16657-27-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Eduardo Habkost 9be385980d coccinelle: Remove unnecessary variables for function return value
Use Coccinelle script to replace 'ret = E; return ret' with
'return E'. The script will do the substitution only when the
function return type and variable type are the same.

Manual fixups:

* audio/audio.c: coding style of "read (...)" and "write (...)"
* block/qcow2-cluster.c: wrap line to make it shorter
* block/qcow2-refcount.c: change indentation of wrapped line
* target-tricore/op_helper.c: fix coding style of
  "remainder|quotient"
* target-mips/dsp_helper.c: reverted changes because I don't
  want to argue about checkpatch.pl
* ui/qemu-pixman.c: fix line indentation
* block/rbd.c: restore blank line between declarations and
  statements

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Unused Coccinelle rule name dropped along with a redundant comment;
whitespace touched up in block/qcow2-cluster.c; stale commit message
paragraph deleted]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-20 16:38:13 +02:00
Igor Mammedov 2474bfd460 spapr: implement query-hotpluggable-cpus callback
It returns a list of present/possible to hotplug CPU
objects with a list of properties to use with
device_add.

in spapr case returned list would looks like:
-> { "execute": "query-hotpluggable-cpus" }
<- {"return": [
     { "props": { "core": 8 }, "type": "POWER8-spapr-cpu-core",
       "vcpus-count": 2 },
     { "props": { "core": 0 }, "type": "POWER8-spapr-cpu-core",
       "vcpus-count": 2,
       "qom-path": "/machine/unattached/device[0]"}
   ]}'

TODO:
  add 'node' property for core <-> numa node mapping

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao 6f4b5c3ec5 spapr: CPU hot unplug support
Remove the CPU core device by removing the underlying CPU thread devices.
Hot removal of CPU for sPAPR guests is achieved by sending the hot unplug
notification to the guest. Release the vCPU object after CPU hot unplug so
that vCPU fd can be parked and reused.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao af81cf323c spapr: CPU hotplug support
Set up device tree entries for the hotplugged CPU core and use the
exising RTAS event logging infrastructure to send CPU hotplug notification
to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao 94a94e4c49 spapr: convert boot CPUs into CPU core devices
Introduce sPAPRMachineClass.dr_cpu_enabled to indicate support for
CPU core hotplug. Initialize boot time CPUs as core deivces and prevent
topologies that result in partially filled cores. Both of these are done
only if CPU core hotplug is supported.

Note: An unrelated change in the call to xics_system_init() is done
in this patch as it makes sense to use the local variable smt introduced
in this patch instead of kvmppc_smt_threads() call here.

TODO: We derive sPAPR core type by looking at -cpu <model>. However
we don't take care of "compat=" feature yet for boot time as well
as hotplug CPUs.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao afd10a0fa6 spapr: Move spapr_cpu_init() to spapr_cpu_core.c
Start consolidating CPU init related routines in spapr_cpu_core.c. As
part of this, move spapr_cpu_init() and its dependencies from spapr.c
to spapr_cpu_core.c

No functionality change in this patch.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Rename TIMEBASE_FREQ to SPAPR_TIMEBASE_FREQ, since it's now in a
 public(ish) header]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao 3b54254966 spapr: Abstract CPU core device and type specific core devices
Add sPAPR specific abastract CPU core device that is based on generic
CPU core device. Use this as base type to create sPAPR CPU specific core
devices.

TODO:
- Add core types for other remaining CPU types
- Handle CPU model alias correctly

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao aab99135b6 spapr_drc: Prevent detach racing against attach for CPU DR
If a CPU is hot removed while hotplug of the same is still in progress,
the guest crashes. Prevent this by ensuring that detach is done only
after attach has completed.

The existing code already prevents such race for PCI hotplug. However
given that CPU is a logical DR unlike PCI and starts with ISOLATED
state, we need a logic that works for CPU too.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [Don't set awaiting_attach for PCI devices]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Thomas Huth a1aa130989 hw/ppc/spapr: Silence deprecation message in qtest mode
When running "make check", there is currently always an error message
saying "spapr-pci-vfio-host-bridge is deprecated". This happens because
the QOM tests are instantiating all possible devices, and the error
message is currently located in the instance_init() function of the
device. Since it is legal for the tests to instantiate a device without
using it, the error message should be silenced when we're running in
test mode.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 09:47:59 +10:00
Bharata B Rao d0e5a8f293 spapr: Ensure all LMBs are represented in ibm,dynamic-memory
Memory hotplug can fail for some combinations of RAM and maxmem when
DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
on maximum addressable memory returned by guest and this value is currently
being calculated wrongly by the guest kernel routine memory_hotplug_max().
While there is an attempt to fix the guest kernel, this patch works
around the problem within QEMU itself.

memory_hotplug_max() routine in the guest kernel arrives at max
addressable memory by multiplying lmb-size with the lmb-count obtained
from ibm,dynamic-memory property. There are two assumptions here:

- All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
  where only hot-pluggable LMBs are present in this property.
- The memory area comprising of RAM and hotplug region is contiguous: This
  needn't be true always for PowerKVM as there can be gap between
  boot time RAM and hotplug region.

To work around this guest kernel bug, ensure that ibm,dynamic-memory
has information about all the LMBs (RMA, boot-time LMBs, future
hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
hotpluggable region).

RMA is represented separately by memory@0 node. Hence mark RMA LMBs
and also the LMBs for the gap b/n RAM and hotpluggable region as
reserved and as having no valid DRC so that these LMBs are not considered
by the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 13:20:01 +10:00
Thomas Huth b30ff227c2 ppc: Add PowerISA 2.07 compatibility mode
Make sure that guests can use the PowerISA 2.07 CPU sPAPR
compatibility mode when they request it and the target CPU
supports it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth 8cd2ce7aaa ppc: Split pcr_mask settings into supported bits and the register mask
The current pcr_mask values are ambiguous: Should these be the mask
that defines valid bits in the PCR register? Or should these rather
indicate which compatibility levels are possible? Anyway, POWER6 and
POWER7 should certainly not use the same values here. So let's
introduce an additional variable "pcr_supported" here which is
used to indicate the valid compatibility levels, and use pcr_mask
to signal the valid bits in the PCR register.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth 7386ae6372 ppc/spapr: Refactor h_client_architecture_support() CPU parsing code
The h_client_architecture_support() function has become quite big
and nested already. So factor out the code that takes care of the
sPAPR compatibility PVRs (which will be modified by the following
patches).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:37 +10:00
Eduardo Habkost 4bcbe0b636 vl: Eliminate usb_enabled()
This wrapper for machine_usb(current_machine) is not necessary,
replace all usages of usb_enabled() with machine_usb().

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-arm@nongnu.org
Cc: qemu-ppc@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1465419025-21519-3-git-send-email-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-13 13:24:41 +02:00
Laurent Vivier a2c5eaf7a9 ppc: Remove a potential overflow in muldiv64()
The coccinelle script:
scripts/coccinelle/overflow_muldiv64.cocci
gives us a list of potential overflows in muldiv64()
(the two first parameters are 64bit values).

This patch fixes one, as the fix seems obvious:

replace muldiv64(a, b, c) by muldiv64(b, a, c)
as "a" and "b" are 64bit values but a <= NANOSECONDS_PER_SECOND.
(10^9 -> 30bit value).

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Markus Armbruster 679dd415bb spapr_pci: Drop cannot_instantiate_with_device_add_yet=false
It's become redundant since it was added in commit 09aa9a5 "spapr-pci:
enable adding PHB via -device".

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Bharata B Rao 1ea1eefcbb spapr: Introduce pseries-2.7 machine type
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Bharata B Rao 71c9a3dd04 spapr: Increase hotpluggable memory slots to 256
KVM now supports 512 memslots on PowerPC (earlier it was 32). Allow half
of it (256) to be used as hotpluggable memory slots.

Instead of hard coding the max value, use the KVM supplied value if KVM
is enabled. Otherwise resort to the default value of 32.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy b3162f22cb spapr_pci: Add and export DMA resetting helper
This will be later used by the "ibm,reset-pe-dma-window" RTAS handler
which resets the DMA configuration to the defaults.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy acf1b6dd22 spapr_pci: Reset DMA config on PHB reset
LoPAPR dictates that during system reset all DMA windows must be removed
and the default DMA32 window must be created so does the patch.

At the moment there is just one window supported so no change in
behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy b4b6eb771a spapr_iommu: Add root memory region
We are going to have multiple DMA windows at different offsets on
a PCI bus. For the sake of migration, we will have as many TCE table
objects pre-created as many windows supported.
So we need a way to map windows dynamically onto a PCI bus
when migration of a table is completed but at this stage a TCE table
object does not have access to a PHB to ask it to map a DMA window
backed by just migrated TCE table.

This adds a "root" memory region (UINT64_MAX long) to the TCE object.
This new region is mapped on a PCI bus with enabled overlapping as
there will be one root MR per TCE table, each of them mapped at 0.
The actual IOMMU memory region is a subregion of the root region and
a TCE table enables/disables this subregion and maps it at
the specific offset inside the root MR which is 1:1 mapping of
a PCI address space.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy a26fdf3934 spapr_iommu: Migrate full state
The source guest could have reallocated the default TCE table and
migrate bigger/smaller table. This adds reallocation in post_load()
if the default table size is different on source and destination.

This adds @bus_offset, @page_shift to the migration stream as
a subsection so when DDW is added, migration to older machines will
still be possible. As @bus_offset and @page_shift are not used yet,
this makes no change in behavior.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy df7625d422 spapr_iommu: Introduce "enabled" state for TCE table
Currently TCE tables are created once at start and their sizes never
change. We are going to change that by introducing a Dynamic DMA windows
support where DMA configuration may change during the guest execution.

This changes spapr_tce_new_table() to create an empty zero-size IOMMU
memory region (IOMMU MR). Only LIOBN is assigned by the time of creation.
It still will be called once at the owner object (VIO or PHB) creation.

This introduces an "enabled" state for TCE table objects, some
helper functions are added:
- spapr_tce_table_enable() receives TCE table parameters, stores in
sPAPRTCETable and allocates a guest view of the TCE table
(in the user space or KVM) and sets the correct size on the IOMMU MR;
- spapr_tce_table_disable() disposes the table and resets the IOMMU MR
size; it is made public as the following DDW code will be using it.

This changes the PHB reset handler to do the default DMA initialization
instead of spapr_phb_realize(). This does not make differenct now but
later with more than just one DMA window, we will have to remove them all
and create the default one on a system reset.

No visible change in behaviour is expected except the actual table
will be reallocated every reset. We might optimize this later.

The other way to implement this would be dynamically create/remove
the TCE table QOM objects but this would make migration impossible
as the migration code expects all QOM objects to exist at the receiver
so we have to have TCE table objects created when migration begins.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Benjamin Herrenschmidt cd0c6f4735 ppc: Do some batching of TCG tlb flushes
On ppc64 especially, we flush the tlb on any slbie or tlbie instruction.

However, those instructions often come in bursts of 3 or more (context
switch will favor a series of slbie's for example to an slbia if the
SLB has less than a certain number of entries in it, and tlbie's can
happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time.

Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...

Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.

This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.

This reduces the number of tlb_flush() on a boot to a ubuntu installer
first dialog screen from roughly 360K down to 36K.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: added a 'CPUPPCState *' variable in h_remove() and
      h_bulk_remove() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: removed spurious whitespace change, use 0/1 not true/false
      consistently, since tlb_need_flush has int type]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Alexey Kardashevskiy fec5d3a1cd spapr_iommu: Move table allocation to helpers
At the moment presence of vfio-pci devices on a bus affect the way
the guest view table is allocated. If there is no vfio-pci on a PHB
and the host kernel supports KVM acceleration of H_PUT_TCE, a table
is allocated in KVM. However, if there is vfio-pci and we do yet not
KVM acceleration for these, the table has to be allocated by
the userspace. At the moment the table is allocated once at boot time
but next patches will reallocate it.

This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts
to helpers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Alexey Kardashevskiy eded5bac3b spapr_pci: Use correct DMA LIOBN when composing the device tree
The user could have picked LIOBN via the CLI but the device tree
rendering code would still use the value derived from the PHB index
(which is the default fallback if LIOBN is not set in the CLI).

This replaces SPAPR_PCI_LIOBN() with the actual DMA LIOBN value.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Jianjun Duan 5dd5238c0b spapr: ensure device trees are always associated with DRC
There are possible racing situations involving hotplug events and
guest migration. For cases where a hotplug event is migrated, or
the guest is in the process of fetching device tree at the time of
migration, we need to ensure the device tree is created and
associated with the corresponding DRC for devices that were
hotplugged on the source, but 'coldplugged' on the target.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Zhou Jie 8afc22a20f Added negative check for get_image_size()
This patch adds check for negative return value from get_image_size(),
where it is missing. It avoids unnecessary two function calls.

Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Alexey Kardashevskiy d78c19b5cf memory: Fix IOMMU replay base address
Since a788f227 "memory: Allow replay of IOMMU mapping notifications"
when new VFIO listener is added, all existing IOMMU mappings are
replayed. However there is a problem that the base address of
an IOMMU memory region (IOMMU MR) is ignored which is not a problem
for the existing user (which is pseries) with its default 32bit DMA
window starting at 0 but it is if there is another DMA window.

This stores the IOMMU's offset_within_address_space and adjusts
the IOVA before calling vfio_dma_map/vfio_dma_unmap.

As the IOMMU notifier expects IOVA offset rather than the absolute
address, this also adjusts IOVA in sPAPR H_PUT_TCE handler before
calling notifier(s).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-26 11:12:08 -06:00
Igor Mammedov bacc344c54 machine: add properties to compat_props incrementaly
Switch to adding compat properties incrementaly instead of
completly overwriting compat_props per machine type.
That removes data duplication which we have due to nested
[PC|SPAPR]_COMPAT_* macros.

It also allows to set default device properties from
default foo_machine_options() hook, which will be used
in following patch for putting VMGENID device as
a function if ISA bridge on pc/q35 machines.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Fixed CCW_COMPAT_* and PC_COMPAT_0_* defines]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Paolo Bonzini 63c915526d cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions.  It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini 03dd024ff5 hw: explicitly include qemu/log.h
Move the inclusion out of hw/hw.h, most files do not need it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini 33c11879fd qemu-common: push cpu.h inclusion out of qemu-common.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini 77ac58ddc6 dma: do not depend on kvm_enabled()
Memory barriers are needed also by Xen and, when the ioeventfd
bugs are fixed, by TCG as well.

sysemu/kvm.h is not anymore needed in sysemu/dma.h, move it to
the actual users.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini cbd62f8616 hw: do not use VMSTATE_*TL
Reserve this to CPU state serialization.

Luckily, they were only used by sPAPR devices and these are ppc64
only.  So there is no change to migration format.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini aa5a9e2484 ppc: use PowerPCCPU instead of CPUPPCState
This changes a cpu.h dependency for hw/ppc/ppc.h into a cpu-qom.h
dependency.  For it to compile we also need to clean up a few unused
definitions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Eric Blake d9f62dde13 qapi: Simplify semantics of visit_next_list()
The semantics of the list visit are somewhat baroque, with the
following pseudocode when FooList is used:

start()
for (prev = head; cur = next(prev); prev = &cur) {
    visit(&cur->value)
}

Note that these semantics (advance before visit) requires that
the first call to next() return the list head, while all other
calls return the next element of the list; that is, every visitor
implementation is required to track extra state to decide whether
to return the input as-is, or to advance.  It also requires an
argument of 'GenericList **' to next(), solely because the first
iteration might need to modify the caller's GenericList head, so
that all other calls have to do a layer of dereferencing.

Thankfully, we only have two uses of list visits in the entire
code base: one in spapr_drc (which completely avoids
visit_next_list(), feeding in integers from a different source
than uint8List), and one in qapi-visit.py.  That is, all other
list visitors are generated in qapi-visit.c, and share the same
paradigm based on a qapi FooList type, so we can refactor how
lists are laid out with minimal churn among clients.

We can greatly simplify things by hoisting the special case
into the start() routine, and flipping the order in the loop
to visit before advance:

start(head)
for (tail = *head; tail; tail = next(tail)) {
    visit(&tail->value)
}

With the simpler semantics, visitors have less state to track,
the argument to next() is reduced to 'GenericList *', and it
also becomes obvious whether an input visitor is allocating a
FooList during visit_start_list() (rather than the old way of
not knowing if an allocation happened until the first
visit_next_list()).  As a minor drawback, we now allocate in
two functions instead of one, and have to pass the size to
both functions (unless we were to tweak the input visitors to
cache the size to start_list for reuse during next_list, but
that defeats the goal of less visitor state).

The signature of visit_start_list() is chosen to match
visit_start_struct(), with the new parameters after 'name'.

The spapr_drc case is a virtual visit, done by passing NULL for
list, similarly to how NULL is passed to visit_start_struct()
when a qapi type is not used in those visits.  It was easy to
provide these semantics for qmp-output and dealloc visitors,
and a bit harder for qmp-input (several prerequisite patches
refactored things to make this patch straightforward).  But it
turned out that the string and opts visitors munge enough other
state during visit_next_list() to make it easier to just
document and require a GenericList visit for now; an assertion
will remind us to adjust things if we need the semantics in the
future.

Several pre-requisite cleanup patches made the reshuffling of
the various visitors easier; particularly the qmp input visitor.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake 15c2f669e3 qapi: Split visit_end_struct() into pieces
As mentioned in previous patches, we want to call visit_end_struct()
functions unconditionally, so that visitors can release resources
tied up since the matching visit_start_struct() without also having
to worry about error priority if more than one error occurs.

Even though error_propagate() can be safely used to ignore a second
error during cleanup caused by a first error, it is simpler if the
cleanup cannot set an error.  So, split out the error checking
portion (basically, input visitors checking for unvisited keys) into
a new function visit_check_struct(), which can be safely skipped if
any earlier errors are encountered, and leave the cleanup portion
(which never fails, but must be called unconditionally if
visit_start_struct() succeeded) in visit_end_struct().

Generated code in qapi-visit.c has diffs resembling:

|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
|         goto out_obj;
|     }
|     visit_type_ACPIOSTInfo_members(v, obj, &err);
|-    error_propagate(errp, err);
|-    err = NULL;
|+    if (err) {
|+        goto out_obj;
|+    }
|+    visit_check_struct(v, &err);
| out_obj:
|-    visit_end_struct(v, &err);
|+    visit_end_struct(v);
| out:

and in qapi-event.c:

@@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP
|         goto out;
|     }
|     visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, &param, &err);
|-    visit_end_struct(v, err ? NULL : &err);
|+    if (!err) {
|+        visit_check_struct(v, &err);
|+    }
|+    visit_end_struct(v);
|     if (err) {
|         goto out;

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com>
[Conflict with a doc fixup resolved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake a543a554cf spapr_drc: Expose 'null' in qom-get when there is no fdt
Now that the QMP output visitor supports an explicit null
output, we should utilize it to make it easier to diagnose
the difference between a missing fdt ('null') vs. a
present-but-empty one ('{}').

(Note that this reverts the behavior of commit ab8bf1d, taking
us back to the behavior of commit 6c2f9a1 [which in turn
stemmed from a crash fix in 1d10b44]; but that this time,
the change is intentional and not an accidental side-effect.)

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1461879932-9020-17-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Michael Roth df18b2db69 spapr_drc: fix aborts during DRC-count based hotplug
CPU/memory resources can be signalled en-masse via
spapr_hotplug_req_add_by_count(), and when doing so, actually change
the meaning of the 'drc' parameter passed to
spapr_hotplug_req_event() to be a count rather than an index.

f40eb92 added a hook in spapr_hotplug_req_event() to record when a
device had been 'signalled' to the guest, but that code assumes that
drc is always an index. In cases where it's a count, such as memory
hotplug, the DRC lookup will fail, leading to an assert.

Fix this by only explicitly setting the signalled state for cases where
we are doing PCI hotplug.

For other resources types, since we cannot selectively track whether a
resource has been signalled in cases where we signal attach as a count,
set the 'signalled' state to true immediately upon making the
resource available via drck->attach().

Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: david@gibson.dropbear.id.au
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-26 11:16:08 +10:00
Thomas Huth da34fed707 hw/ppc/spapr: Fix crash when specifying bad parameters to spapr-pci-host-bridge
QEMU currently crashes when using bad parameters for the
spapr-pci-host-bridge device:

$ qemu-system-ppc64 -device spapr-pci-host-bridge,buid=0x123,liobn=0x321,mem_win_addr=0x1,io_win_addr=0x10
Segmentation fault

The problem is that spapr_tce_find_by_liobn() might return NULL, but
the code in spapr_populate_pci_dt() does not check for this condition
and then tries to dereference this NULL pointer.
Apart from that, the return value of spapr_populate_pci_dt() also
has to be checked for all PCI buses, not only for the last one, to
make sure we catch all errors.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-23 16:52:20 +10:00
Peter Maydell 3be4f4d724 ppc patch queue for 2016-04-08
Just a single bugfix for spapr in this batch, but I want to make sure
 it gets in for 2.6.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJXBzt1AAoJEGw4ysog2bOSyGQQAIL4aADwOhNoVLjtvBN3eoPQ
 cP+Ps3DCK/9Z9l00cMR6/8zk5Q2Nb1FLf2Y3f3c2JVEFER8XCnsYPIyYfOZaMex4
 /8DUfVueTh0RmpxhWwA4vQJtDqrilB0tUkkqgWFPE2luJcTVTUU7mig788d2yrmp
 J35ncNaMcrXGy0Uh/wBlnOpfHD17ds8Sgpw02TT9QusqIjq8MWIkgat0v+h4RmRL
 lzEE5N1Vp8vOvJENTEnuuKFbFTxcvhBS+A2K1y+s10k7c1CuFFJpAZY7g3T4hpqU
 NZAirty5WeMlSYk9A0gQhgHWq2XSgbDWWj6tMGd5sCEQH5D6Kty0TPWnCpzSxjgu
 aqGr7BqAV+NV/Rr/jGy4gvE432f1pZWUIxq271OH9H5aniCWSYFBR7w4UEaM1BPQ
 I5tzkp7P1PMWIm/K5ryFVo083kU08KFXZDSbQR/vu4O+DuohPUKYid5cv4wJj/W+
 GSzBwTwtp8iY2rs/nbMptSYHKYFYtd5PuALf4BoK62sF72NtWq+41X3QV8I4cIQd
 hM03NyuObgnY7aygPmo9OGsvW/Dx8DKKoEO0QX+2gFa22rJ+j7RLSu7pHFW1JEXa
 5VkVlTtN8L5NeeG0PdkgkChcgiqahUA6bRjekpFzdoncfsmmiPkiP5xQqK1DVKhW
 SoJacddcj86QGpT1aioU
 =4ZAr
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160408' into staging

ppc patch queue for 2016-04-08

Just a single bugfix for spapr in this batch, but I want to make sure
it gets in for 2.6.

# gpg: Signature made Fri 08 Apr 2016 06:02:45 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160408:
  spapr: Fix ibm,lrdr-capacity

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-04-08 11:54:19 +01:00
Bharata B Rao a110655a06 spapr: Fix ibm,lrdr-capacity
ibm,lrdr-capacity has a field to describe the maximum address in bytes
and therefore, the most memory that can be allocated to this guest. We
are using maxmem for this field, but instead should use the actual RAM
address corresponding to the end of hotplug region.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-08 11:18:10 +10:00
Gonglei 1a5512bb7e spapr: fix possible Negative array index read
fix CID 1351391.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1456998223-12356-6-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-04-08 00:07:56 +02:00
Michael Roth f40eb921da spapr_drc: enable immediate detach for unsignalled devices
Currently spapr doesn't support "aborting" hotplug of PCI
devices by allowing device_del to immediately remove the
device if we haven't signalled the presence of the device
to the guest.

In the past this wasn't an issue, since we always immediately
signalled device attach and simply relied on full guest-aware
add->remove path for device removal. However, as of 788d259,
we now defer signalling for PCI functions until function 0
is attached, so now we need to deal with these "abort" operations
for cases where a user hotplugs a non-0 function, then opts to
remove it prior hotplugging function 0. Currently they'd have to
reboot before the unplug completed. PCIe multifunction hotplug
does not have this requirement however, so from a management
implementation perspective it would be good to address this within
the same release as 788d259.

We accomplish this by simply adding a 'signalled' flag to track
whether a device hotplug event has been sent to the guest. If it
hasn't, we allow immediate removal under the assumption that the
guest will not be using the device. Devices present at boot/reset
time are also assumed to be 'signalled'.

For CPU/memory/etc, signalling will still happen immediately
as part of device_add, so only PCI functions should be affected.

Cc: bharata@linux.vnet.ibm.com
Cc: david@gibson.dropbear.id.au
Cc: sbhat@linux.vnet.ibm.com
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: This fixes a regression where an incorrect hot-add of a non-zero
      function can no longer be backed out until function 0 is added]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-05 10:47:03 +10:00
Cédric Le Goater 5c94b2a5e5 ppc: Rework POWER7 & POWER8 exception model
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>

This patch fixes the current AIL implementation for POWER8. The
interrupt vector address can be calculated directly from LPCR when the
exception is handled. The excp_prefix update becomes useless and we
can cleanup the H_SET_MODE hcall.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: Removed LPES0/1 handling for HV vs. !HV
      Fixed LPCR_ILE case for POWERPC_EXCP_POWER8 ]
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[dwg: This was written as a cleanup, but it also fixes a real bug
      where setting an alternative interrupt location would not be
      correctly migrated]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-04-05 10:38:24 +10:00
Peter Maydell 84a5a80148 * Log filtering from Alex and Peter
* Chardev fix from Marc-André
 * config.status tweak from David
 * Header file tweaks from Markus, myself and Veronia (Outreachy candidate)
 * get_ticks_per_sec() removal from Rutuja (Outreachy candidate)
 * Coverity fix from myself
 * PKE implementation from myself, based on rth's XSAVE support
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQEcBAABCAAGBQJW9ErPAAoJEL/70l94x66DJfEH/A/QkMpAhrgNdyVsahzsGrzE
 wx5gHFIc1nBYxyr62w4apUb5jPB7zaXu0LA7EAWDeAe0pyP8hZzLT9kJyOEDsuJu
 zwKN2QeLSNMtPbnbKN0I/YQ2za2xX1V5ruhSeOJoVslUI214hgnAURaGshhQNzuZ
 2CluDT9KgL5cQifAnKs5kJrwhIYShYNQB+1eDC/7wk28dd/EH+sPALIoF+rqrSmt
 Zu4Mdqd+9Ns+oKOjA6br9ULq/Hzg0aDfY82J+XLVVqfF3PXQe8rTDmuMf/7jTn+M
 Un7ZOcei9oZF2/9vfAfKQpDCcgD9HvOUSbgqV/ubmkPPmN/LNJzeKj0fBhrRN+Y=
 =K12D
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging

* Log filtering from Alex and Peter
* Chardev fix from Marc-André
* config.status tweak from David
* Header file tweaks from Markus, myself and Veronia (Outreachy candidate)
* get_ticks_per_sec() removal from Rutuja (Outreachy candidate)
* Coverity fix from myself
* PKE implementation from myself, based on rth's XSAVE support

# gpg: Signature made Thu 24 Mar 2016 20:15:11 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (28 commits)
  target-i386: implement PKE for TCG
  config.status: Pass extra parameters
  char: translate from QIOChannel error to errno
  exec: fix error handling in file_ram_alloc
  cputlb: modernise the debug support
  qemu-log: support simple pid substitution for logs
  target-arm: dfilter support for in_asm
  qemu-log: dfilter-ise exec, out_asm, op and opt_op
  qemu-log: new option -dfilter to limit output
  qemu-log: Improve the "exec" TB execution logging
  qemu-log: Avoid function call for disabled qemu_log_mask logging
  qemu-log: correct help text for -d cpu
  tcg: pass down TranslationBlock to tcg_code_gen
  util: move declarations out of qemu-common.h
  Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND
  hw: explicitly include qemu-common.h and cpu.h
  include/crypto: Include qapi-types.h or qemu/bswap.h instead of qemu-common.h
  isa: Move DMA_transfer_handler from qemu-common.h to hw/isa/isa.h
  Move ParallelIOArg from qemu-common.h to sysemu/char.h
  Move QEMU_ALIGN_*() from qemu-common.h to qemu/osdep.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Conflicts:
	scripts/clean-includes
2016-03-24 21:42:40 +00:00
Thomas Huth 57c522f47b hw/net/spapr_llan: Enable the RX buffer pools by default for new machines
RX buffer pools are now enabled by default for new machine types.
For older machine types, they are still disabled to avoid breaking
migration.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-24 11:17:34 +11:00
Benjamin Herrenschmidt 26a7f1291b ppc: Create cpu_ppc_set_papr() helper
And move the code adjusting the MSR mask and calling kvmppc_set_papr()
to it. This allows us to add a few more things such as disabling setting
of MSR:HV and appropriate LPCR bits which will be used when fixing
the exception model.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: removed LPCR setting ]
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-24 11:17:34 +11:00
Alexey Kardashevskiy 0ddbd05362 spapr/target-ppc/kvm: Only add hcall-instructions if KVM supports it
ePAPR defines "hcall-instructions" device-tree property which contains
code to call hypercalls in ePAPR paravirtualized guests.  In general
pseries guests won't use this property, instead using the PAPR defined
hypercall interface.

However, this property has been re-used to implement a hack to allow
PR KVM to run (slightly modified) guests in some situations where it
otherwise wouldn't be able to (because the system's L0 hypervisor
doesn't forward the PAPR hypercalls to the PR KVM kernel).

Hence, this property is always present in the device tree for pseries
guests. All KVM guests use it at least to read features via the
KVM_HC_FEATURES hypercall.

The property is populated by the code returned from the KVM's
KVM_PPC_GET_PVINFO ioctl; if not implemented in the KVM, QEMU supplies
code which will fail all hypercall attempts. If QEMU does not create
the property, and the guest kernel is compiled with
CONFIG_EPAPR_PARAVIRT (which is normally the case), there is exactly
the same stub at @epapr_hypercall_start already.

Rather than maintaining this fairly useless stub implementation, it
makes more sense not to create the property in the device tree in the
first place if the host kernel does not implement it.

This changes kvmppc_get_hypercall() to return 1 if the host kernel
does not implement KVM_CAP_PPC_GET_PVINFO. The caller can use it to decide
on whether to create the property or not.

This changes the pseries machine to not create the property if KVM does
not implement KVM_PPC_GET_PVINFO. In practice this means that from now
on the property will not be created if either HV KVM or TCG is used.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[reworded commit message for clarity --dwg]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-24 11:17:33 +11:00
Veronia Bahaa f348b6d1a5 util: move declarations out of qemu-common.h
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)

Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Rutuja Shah 73bcb24d93 Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND
This patch replaces get_ticks_per_sec() calls with the macro
NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec()
is then removed.  This replacement improves the readability and
understandability of code.

For example,

    timer_mod(fdctrl->result_timer,
	      qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50));

NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns
matches the unit of the expression on the right side of the plus.

Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Paolo Bonzini 4771d756f4 hw: explicitly include qemu-common.h and cpu.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:17 +01:00
Markus Armbruster da34e65cb4 include/qemu/osdep.h: Don't include qapi/error.h
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef.  Since then, we've moved to include qemu/osdep.h
everywhere.  Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h.  That's in excess of
100KiB of crap most .c files don't actually need.

Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h.  Include qapi/error.h in .c files that need it and don't
get it now.  Include qapi-types.h in qom/object.h for uint16List.

Update scripts/clean-includes accordingly.  Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h.  Update the list of includes in the qemu/osdep.h
comment quoted above similarly.

This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third.  Unfortunately, the number depending on
qapi-types.h shrinks only a little.  More work is needed for that one.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-22 22:20:15 +01:00
Eduardo Habkost 0e6aac87fd machine: Use type_init() to register machine classes
Change all machine_init() users that simply call type_register*()
to use type_init().

Cc: Evgeny Voevodin <e.voevodin@samsung.com>
Cc: Maksim Kozlov <m.kozlov@samsung.com>
Cc: Igor Mitsyanko <i.mitsyanko@gmail.com>
Cc: Dmitry Solodkiy <d.solodkiy@samsung.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: "Hervé Poussineau" <hpoussin@reactos.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-03-16 15:34:05 -03:00
David Gibson a36304fdca spapr_pci: Remove finish_realize hook
Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
we can fold that implementation into its (single) caller, and remove the
hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson 72700d7e73 spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
    1. automatically sizes its DMA window to match the host IOMMU
    2. checks if the attached VFIO container is backed by the
       VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.  It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.

(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.

It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
reduces it to just a stub for backwards compatibility.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson c1fa017c7e spapr_pci: Allow EEH on spapr-pci-host-bridge
Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson fbb4e98341 spapr_pci: Eliminate class callbacks
The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:10 +11:00
David Gibson 76a9e9f680 spapr_pci: Switch to vfio_eeh_as_op() interface
This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
interface.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:10 +11:00
Greg Kurz f1a6cf3ef7 spapr_rng: fix race with main loop
Since commit "60253ed1e6ec rng: add request queue support to rng-random",
the use of a spapr_rng device may hang vCPU threads.

The following path is taken without holding the lock to the main loop mutex:

h_random()
  rng_backend_request_entropy()
    rng_random_request_entropy()
      qemu_set_fd_handler()

The consequence is that entropy_available() may be called before the vCPU
thread could even queue the request: depending on the scheduling, it may
happen that entropy_available() does not call random_recv()->qemu_sem_post().
The vCPU thread will then sleep forever in h_random()->qemu_sem_wait().

This could not happen before 60253ed1e6 because entropy_available() used
to call random_recv() unconditionally.

This patch ensures the lock is held to avoid the race.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:06 +11:00
David Gibson c18ad9a54b target-ppc: Eliminate kvmppc_kern_htab global
fa48b43 "target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM"
purports to remove a hack in the handling of hash page tables (HPTs)
managed by KVM instead of qemu.  However, it actually went in the wrong
direction.

That patch requires anything looking for an external HPT (that is one not
managed by the guest itself) to check both env->external_htab (for a qemu
managed HPT) and kvmppc_kern_htab (for a KVM managed HPT).  That's a
problem because kvmppc_kern_htab is local to mmu-hash64.c, but some places
which need to check for an external HPT are outside that, such as
kvm_arch_get_registers().  The latter was subtly broken by the earlier
patch such that gdbstub can no longer access memory.

Basically a KVM managed HPT is much more like a qemu managed HPT than it is
like a guest managed HPT, so the original "hack" was actually on the right
track.

This partially reverts fa48b43, so we again mark a KVM managed external HPT
by putting a special but non-NULL value in env->external_htab.  It then
goes further, using that marker to eliminate the kvmppc_kern_htab global
entirely.  The ppc_hash64_set_external_hpt() helper function is extended
to set that marker if passed a NULL value (if you're setting an external
HPT, but don't have an actual HPT to set, the assumption is that it must
be a KVM managed HPT).

This also has some flow-on changes to the HPT access helpers, required by
the above changes.

Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-03-16 09:55:06 +11:00
David Gibson e5c0d3ce40 target-ppc: Add helpers for updating a CPU's SDR1 and external HPT
When a Power cpu with 64-bit hash MMU has it's hash page table (HPT)
pointer updated by a write to the SDR1 register we need to update some
derived variables.  Likewise, when the cpu is configured for an external
HPT (one not in the guest memory space) some derived variables need to be
updated.

Currently the logic for this is (partially) duplicated in ppc_store_sdr1()
and in spapr_cpu_reset().  In future we're going to need it in some other
places, so make some common helpers for this update.

In addition the new ppc_hash64_set_external_hpt() helper also updates
SDR1 in KVM - it's not updated by the normal runtime KVM <-> qemu CPU
synchronization.  In a sense this belongs logically in the
ppc_hash64_set_sdr1() helper, but that is called from
kvm_arch_get_registers() so can't itself call cpu_synchronize_state()
without infinite recursion.  In practice this doesn't matter because
the only other caller is TCG specific.

Currently there aren't situations where updating SDR1 at runtime in KVM
matters, but there are going to be in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-16 09:55:06 +11:00
Michael Roth 788d2599de spapr_pci: fix multifunction hotplug
Since 3f1e147, QEMU has adopted a convention of supporting function
hotplug by deferring hotplug events until func 0 is hotplugged.
This is likely how management tools like libvirt would expose
such support going forward.

Since sPAPR guests rely on per-func events rather than
slot-based, our protocol has been to hotplug func 0 *first* to
avoid cases where devices appear within guests without func 0
present to avoid undefined behavior.

To remain compatible with new convention, defer hotplug in a
similar manner, but then generate events in 0-first order as we
did in the past. Once func 0 present, fail any attempts to plug
additional functions (as we do with PCIe).

For unplug, defer unplug operations in a similar manner, but
generate unplug events such that function 0 is removed last in guest.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:05 +11:00
Michael S. Tsirkin 226419d615 msi_supported -> msi_nonbroken
Rename controller flag to make it clearer what it means.
Add some documentation as well.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:45:21 +02:00
Peter Crosthwaite 7ef295ea5b loader: Add data swap option to load-elf
Some CPUs are of an opposite data-endianness to other components in the
system. Sometimes elfs have the data sections layed out with this CPU
data-endianness accounting for when loaded via the CPU, so byte swaps
(relative to other system components) will occur.

The leading example, is ARM's BE32 mode, which is is basically LE with
address manipulation on half-word and byte accesses to access the
hw/byte reversed address. This means that word data is invariant
across LE and BE32. This also means that instructions are still LE.
The expectation is that the elf will be loaded via the CPU in this
endianness scheme, which means the data in the elf is reversed at
compile time.

As QEMU loads via the system memory directly, rather than the CPU, we
need a mechanism to reverse elf data endianness to implement this
possibility.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Greg Kurz a005b3ef50 xics: report errors with the QEMU Error API
Using the return value to report errors is error prone:
- xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors
  on 0
- xics_alloc_block() returns the unclear value of ics->offset - 1 on error
  but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0

This patch adds an errp argument to xics_alloc() and xics_alloc_block() to
report errors. The return value of these functions is a valid IRQ number
if errp is NULL. It is undefined otherwise.

The corresponding error traces get promotted to error messages. Note that
the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also
moves to xics_alloc(). Similar error message consolidation isn't really
applicable to xics_alloc_block() because callers have extra context (device
config address, MSI or MSIX).

This fixes the issues mentioned above.

Based on previous work from Brian W. Hart.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz 09b5e30da5 spapr: skip configuration section during migration of older machines
Since QEMU 2.4, we have a configuration section in the migration stream.
This must be skipped for older machines, like it is already done for x86.

This patch fixes the migration of pseries-2.3 from/to QEMU 2.3, but it
breaks migration of the same machine from/to QEMU 2.4/2.4.1/2.5. We do
that anyway because QEMU 2.3 is likely to be more widely deployed than
newer QEMU versions.

Fixes: 61964c23e5
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz cba0e7796b spapr: disable vmdesc submission for old machines
Since QEMU 2.3, we have a vmdesc section in the migration stream.
This section is not mandatory but when migrating a pseries-2.2
machine from QEMU 2.2, you get a warning at the destination:

qemu-system-ppc64: Expected vmdescription section, but got 0

The warning goes away if we decide to skip vmdesc as well for
older pseries, like it is already done for pc's.

This can only be observed with -cpu POWER7 because POWER8
cannot migrate from QEMU 2.2 to 2.3 (insns_flags2 mismatch).

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz ce266b75fe spapr_pci: fix irq leak in RTAS ibm,change-msi
This RTAS call is used to request new interrupts or to free all interrupts.

If the driver has already allocated interrupts and asks again for a non-null
number of irqs, then the rtas_ibm_change_msi() function will silently leak
the previous interrupts.

It happens because xics_free() is only called when the driver releases all
interrupts (!req_num case). Note that the previously allocated spapr_pci_msi
is not leaked because the GHashTable is created with destroy functions and
g_hash_table_insert() hence frees the old value.

This patch makes sure any previously allocated MSIs are released when a
new allocation succeeds.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz d4a63ac8b1 spapr_pci: kill useless variable in rtas_ibm_change_msi()
The num local variable is initialized to zero and has no writer.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz 3d0db3e74d spapr_rng: disable hotpluggability
It is currently possible to hotplug a spapr_rng device but QEMU crashes
when we try to hot unplug:

ERROR:hw/core/qdev.c:295:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted

This happens because spapr_rng isn't plugged to any bus and sPAPR does
not provide hotplug support for it: qdev_get_hotplug_handler() hence
return NULL and we hit the assertion.

And anyway, it doesn't make much sense to unplug this device since hcalls
cannot be unregistered. Even the idea of hotplugging a RNG device instead
of declaring it on the QEMU command line looks weird.

This patch simply disables hotpluggability for the spapr-rng class.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz 9897e46264 spapr: initialize local Error pointer
This fixes a crash in the target QEMU during migration.

Broken in commit c5f54f3.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[reworded commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:44 +11:00
Thomas Huth 3240dd9a69 hw/ppc/spapr: Implement the h_page_init hypercall
This hypercall either initializes a page with zeros, or copies
another page.
According to LoPAPR, the i-cache of the page should also be
flushed if using H_ICACHE_INVALIDATE or H_ICACHE_SYNCHRONIZE,
and the d-cache should be synchronized to the RAM if the
H_ICACHE_SYNCHRONIZE flag is used. For this, two new functions
are introduced, kvmppc_dcbst_range() and kvmppc_icbi()_range, which
use the corresponding assembler instructions to flush the caches
if running with KVM on Power. If the code runs with TCG instead,
the code only uses tb_flush(), assuming that this will be
enough for synchronization.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:44 +11:00
Thomas Huth 8a9c1b77e9 hw/ppc/spapr: Halt CPU when powering off via RTAS call
The LoPAPR specification defines the following for the RTAS
power-off call: "On successful operation, does not return".
However, the implementation in QEMU currently returns and runs
the guest CPU again for some more cycles. This caused some
trouble with the new ppc implementation of the kvm-unit-tests
recently. So let's make sure that the QEMU implementation
follows the spec, thus stop the CPU to make sure that the
RTAS call does not return to the guest anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-18 11:08:43 +11:00
David Gibson 1c81003acc pseries: Include missing pseries-2.5 compat properties in pseries-2.4
Commit 4b23699 "pseries: Add pseries-2.6 machine type" added a new
SPAPR_COMPAT_2_5 macro in the usual way.  However, it didn't add this
macro to the existing SPAPR_COMPAT_2_4 macro so that pseries-2.4
inherits newer compatibility properties which are needed for 2.5 and
earlier.

This corrects the oversight.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-17 10:25:37 +11:00
Hervé Poussineau 216c906e62 cuda: port SET_DEVICE_LIST command to new framework
Also implement the command, by taking device list mask into account
when polling ADB devices.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau 374312e7c5 cuda: port SET_AUTO_RATE command to new framework
Also implement the command, by removing the hardcoded period of 20 ms/50 Hz
and replacing it by the one requested by user.
Update VMState version to store this new parameter.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Thomas Huth e49ff266f8 hw/ppc/spapr: Implement the h_set_xdabr hypercall
The H_SET_XDABR hypercall is similar to H_SET_DABR, but also sets
the extended DABR (DABRX) register.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Thomas Huth af08a58f0c hw/ppc/spapr: Implement h_set_dabr
According to LoPAPR, h_set_dabr should simply set DABRX to 3
(if the register is available), and load the parameter into DABR.
If DABRX is not available, the hypervisor has to check the
"Breakpoint Translation" bit of the DABR register first.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Thomas Huth 423576f771 hw/ppc/spapr: Add h_set_sprg0 hypercall
This is a very simple hypercall that only sets up the SPRG0
register for the guest (since writing to SPRG0 was only permitted
to the hypervisor in older versions of the PowerISA).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
David Gibson 378bc21756 migration: ensure htab_save_first completes after timeout
htab_save_first_pass could return without finishing its work due to
timeout. The patch checks if another invocation of it is necessary and
will call it in htab_save_complete if necessary.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[removed overlong line]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
David Gibson fa48b4328c target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM
With HV KVM, the guest's hash page table (HPT) is managed by the kernel and
not directly accessible to QEMU.  This means that spapr->htab is NULL
and normally env->external_htab would also be NULL for each cpu.

However, that would cause ppc_hash64_load_hpte*() to do the wrong thing in
the few cases where QEMU does need to load entries from the in-kernel HPT.
Specifically, seeing external_htab is NULL, they would look for an HPT
within the guest's address space instead.

To stop that we have an ugly hack in the pseries machine type code to
set external htab to (void *)1 instead.

This patch removes that hack by having ppc_hash64_load_hpte*() explicitly
check kvmppc_kern_htab instead, which makes more sense.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson c5f54f3e31 pseries: Move hash page table allocation to reset time
At the moment the size of the hash page table (HPT) is fixed based on the
maximum memory allowed to the guest.  As such, we allocate the table during
machine construction, and just clear it at reset.

However, we're planning to implement a PAPR extension allowing the hash
page table to be resized at runtime.  This will mean that on reset we want
to revert it to the default size.  It also means that when migrating, we
need to make sure the destination allocates an HPT of size matching the
host, since the guest could have changed it before the migration.

This patch replaces the spapr_alloc_htab() and spapr_reset_htab() functions
with a new spapr_reallocate_hpt() function.  This is called at reset and
inbound migration only, not during machine init any more.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson 8dfe8e7f4f pseries: Add helper to calculate recommended hash page table size
At present we calculate the recommended hash page table (HPT) size for a
pseries guest just once in ppc_spapr_init() before allocating the HPT.
In future patches we're going to want this calculation in other places, so
this splits it out into a helper function.  While we're at it, change the
calculation to use ctz() instead of an explicit loop.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson 715c54071a pseries: Simplify handling of the hash page table fd
When migrating the 'pseries' machine type with KVM, we use a special fd
to access the hash page table stored within KVM.  Usually, this fd is
opened at the beginning of migration, and kept open until the migration
is complete.

However, if there is a guest reset during the migration, the fd can become
stale and we need to re-open it.  At the moment we use an 'htab_fd_stale'
flag in sPAPRMachineState to signal this, which is checked in the migration
iterators.

But that's rather ugly.  It's simpler to just close and invalidate the
fd on reset, and lazily re-open it in migration if necessary.  This patch
implements that change.

This requires a small addition to the machine state's instance_init,
so that htab_fd is initialized to -1 (telling the migration code it
needs to open it) instead of 0, which could be a valid fd.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
Eric Blake 08f9541dec qapi: Drop unused error argument for list and implicit struct
No backend was setting an error when ending the visit of a list or
implicit struct, or when moving to the next list node.  Make the
callers a bit easier to follow by making this a part of the contract,
and removing the errp argument - callers can then unconditionally end
an object as part of cleanup without having to think about whether a
second error is dominated by a first, because there is no second
error.

A later patch will then tackle the larger task of splitting
visit_end_struct(), which can indeed set an error.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake 337283dffb qapi: Drop unused 'kind' for struct/enum visit
visit_start_struct() and visit_type_enum() had a 'kind' argument
that was usually set to either the stringized version of the
corresponding qapi type name, or to NULL (although some clients
didn't even get that right).  But nothing ever used the argument.
It's even hard to argue that it would be useful in a debugger,
as a stack backtrace also tells which type is being visited.

Therefore, drop the 'kind' argument as dead.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-22-git-send-email-eblake@redhat.com>
[Harmless rebase mistake cleaned up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake d7bce9999d qom: Swap 'name' next to visitor in ObjectPropertyAccessor
Similar to the previous patch, it's nice to have all functions
in the tree that involve a visitor and a name for conversion to
or from QAPI to consistently stick the 'name' parameter next
to the Visitor parameter.

Done by manually changing include/qom/object.h and qom/object.c,
then running this Coccinelle script and touching up the fallout
(Coccinelle insisted on adding some trailing whitespace).

    @ rule1 @
    identifier fn;
    typedef Object, Visitor, Error;
    identifier obj, v, opaque, name, errp;
    @@
     void fn
    - (Object *obj, Visitor *v, void *opaque, const char *name,
    + (Object *obj, Visitor *v, const char *name, void *opaque,
       Error **errp) { ... }

    @@
    identifier rule1.fn;
    expression obj, v, opaque, name, errp;
    @@
     fn(obj, v,
    -   opaque, name,
    +   name, opaque,
        errp)

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-20-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Eric Blake 51e72bc1dd qapi: Swap visit_* arguments for consistent 'name' placement
JSON uses "name":value, but many of our visitor interfaces were
called with visit_type_FOO(v, &value, name, errp).  This can be
a bit confusing to have to mentally swap the parameter order to
match JSON order.  It's particularly bad for visit_start_struct(),
where the 'name' parameter is smack in the middle of the
otherwise-related group of 'obj, kind, size' parameters! It's
time to do a global swap of the parameter ordering, so that the
'name' parameter is always immediately after the Visitor argument.

Additional reason in favor of the swap: the existing include/qjson.h
prefers listing 'name' first in json_prop_*(), and I have plans to
unify that file with the qapi visitors; listing 'name' first in
qapi will minimize churn to the (admittedly few) qjson.h clients.

Later patches will then fix docs, object.h, visitor-impl.h, and
those clients to match.

Done by first patching scripts/qapi*.py by hand to make generated
files do what I want, then by running the following Coccinelle
script to affect the rest of the code base:
 $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
I then had to apply some touchups (Coccinelle insisted on TAB
indentation in visitor.h, and botched the signature of
visit_type_enum() by rewriting 'const char *const strings[]' to
the syntactically invalid 'const char*const[] strings').  The
movement of parameters is sufficient to provoke compiler errors
if any callers were missed.

    // Part 1: Swap declaration order
    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_start_struct
    -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type bool, TV, T1;
    identifier ARG1;
    @@
     bool visit_optional
    -(TV v, T1 ARG1, const char *name)
    +(TV v, const char *name, T1 ARG1)
     { ... }

    @@
    type TV, TErr, TObj, T1;
    identifier OBJ, ARG1;
    @@
     void visit_get_next_type
    -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_type_enum
    -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj;
    identifier OBJ;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
     void VISIT_TYPE
    -(TV v, TObj OBJ, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, TErr errp)
     { ... }

    // Part 2: swap caller order
    @@
    expression V, NAME, OBJ, ARG1, ARG2, ERR;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
    (
    -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
    +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -visit_optional(V, ARG1, NAME)
    +visit_optional(V, NAME, ARG1)
    |
    -visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
    +visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
    |
    -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
    +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -VISIT_TYPE(V, OBJ, NAME, ERR)
    +VISIT_TYPE(V, NAME, OBJ, ERR)
    )

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
David Gibson 1114e712c9 target-ppc: Helper to determine page size information from hpte alone
h_enter() in the spapr code needs to know the page size of the HPTE it's
about to insert.  Unlike other paths that do this, it doesn't have access
to the SLB, so at the moment it determines this with some open-coded
tests which assume POWER7 or POWER8 page size encodings.

To make this more flexible add ppc_hash64_hpte_page_shift_noslb() to
determine both the "base" page size per segment, and the individual
effective page size from an HPTE alone.

This means that the spapr code should now be able to handle any page size
listed in the env->sps table.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
2016-01-30 23:49:27 +11:00
David Gibson 61a36c9b5a target-ppc: Add new TLB invalidate by HPTE call for hash64 MMUs
When HPTEs are removed or modified by hypercalls on spapr, we need to
invalidate the relevant pages in the qemu TLB.

Currently we do that by doing some complicated calculations to work out the
right encoding for the tlbie instruction, then passing that to
ppc_tlb_invalidate_one()... which totally ignores the argument and flushes
the whole tlb.

Avoid that by adding a new flush-by-hpte helper in mmu-hash64.c.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Alexander Graf <agraf@suse.de>
2016-01-30 23:49:27 +11:00
David Gibson 7ef23068bf target-ppc: Convert mmu-hash{32,64}.[ch] from CPUPPCState to PowerPCCPU
Like a lot of places these files include a mixture of functions taking
both the older CPUPPCState *env and newer PowerPCCPU *cpu.  Move a step
closer to cleaning this up by standardizing on PowerPCCPU, except for the
helper_* functions which are called with the CPUPPCState * from tcg.

Callers and some related functions are updated as well, the boundaries of
what's changed here are a bit arbitrary.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2016-01-30 23:37:38 +11:00
David Gibson ecbc25fa86 pseries: Allow TCG h_enter to work with hotplugged memory
The implementation of the H_ENTER hypercall for PAPR guests needs to
enforce correct access attributes on the inserted HPTE.  This means
determining if the HPTE's real address is a regular RAM address (which
requires attributes for coherent access) or an IO address (which requires
attributes for cache-inhibited access).

At the moment this check is implemented with (raddr < machine->ram_size),
but that only handles addresses in the base RAM area, not any hotplugged
RAM.

This patch corrects the problem with a new helper.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30 23:37:38 +11:00
David Gibson 98a5d100c2 pseries: Clean up error reporting in htab migration functions
The functions for migrating the hash page table on pseries machine type
(htab_save_setup() and htab_load()) can report some errors with an
explicit fprintf() before returning an appropriate error code.  Change some
of these to use error_report() instead. htab_save_setup() is omitted for
now to avoid conflicts with some other in-progress work.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson d54e4d7659 pseries: Clean up error reporting in ppc_spapr_init()
This function includes a number of explicit fprintf()s for errors.
Change these to use error_report() instead.

Also replace the single exit(EXIT_FAILURE) with an explicit exit(1), since
the latter is the more usual idiom in qemu by a large margin.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson 1e49182d05 pseries: Clean up error handling in xics_system_init()
Use the error handling infrastructure to pass an error out from
try_create_xics() instead of assuming &error_abort - the caller is in a
better position to decide on error handling policy.

Also change the error handling from an &error_abort to &error_fatal, since
this occurs during the initial machine construction and could be triggered
by bad configuration rather than a program error.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson adf9ac50db pseries: Clean up error handling in spapr_rtas_register()
The errors detected in this function necessarily indicate bugs in the rest
of the qemu code, rather than an external or configuration problem.

So, a simple assert() is more appropriate than any more complex error
reporting.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson 14c6a89497 pseries: Clean up error handling in spapr_vga_init()
Use error_setg() to return an error rather than an explicit exit().
Previously it was an exit(0) instead of a non-zero exit code, which was
simply a bug.  Also improve the error message.

While we're at it change the type of spapr_vga_init() to bool since that's
how we're using it anyway.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson 7c150d6f04 pseries: Clean up error handling in spapr_validate_node_memory()
Use error_setg() and return an error, rather than using an explicit exit().

Also improve messages, and be more explicit about which constraint failed.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson 569f49671d pseries: Clean up error handling of spapr_cpu_init()
Currently spapr_cpu_init() is hardcoded to handle any errors as fatal.
That works for now, since it's only called from initial setup where an
error here means we really can't proceed.

However, we'll want to handle this more flexibly for cpu hotplug in future
so generalize this using the error reporting infrastructure.  While we're
at it make a small cleanup in a related part of ppc_spapr_init() to use
error_report() instead of an old-style explicit fprintf().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
David Gibson f9ab1e87ed ppc: Clean up error handling in ppc_set_compat()
Current ppc_set_compat() returns -1 for errors, and also (unconditionally)
reports an error message.  The caller in h_client_architecture_support()
may then report it again using an outdated fprintf().

Clean this up by using the modern error reporting mechanisms.  Also add
strerror(errno) to the error message.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-01-30 23:37:37 +11:00
Bharata B Rao 16c25aef53 spapr: Don't create ibm,dynamic-reconfiguration-memory w/o DR LMBs
If guest doesn't have any dynamically reconfigurable (DR) logical memory
blocks (LMB), then we shouldn't create ibm,dynamic-reconfiguration-memory
device tree node.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-30 23:37:37 +11:00
David Gibson 27ac3e06d5 spapr: Remove abuse of rtas_ld() in h_client_architecture_support
h_client_architecture_support() uses rtas_ld() for general purpose memory
access, despite the fact that it's not an RTAS routine at all and rtas_ld
makes things more awkward.

Clean this up by replacing rtas_ld() calls with appropriate ldXX_phys()
calls.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30 23:37:36 +11:00
David Gibson f201987b84 spapr: Remove rtas_st_buffer_direct()
rtas_st_buffer_direct() is a not particularly useful wrapper around
cpu_physical_memory_write().  All the callers are in
rtas_ibm_configure_connector, where it's better handled by local helper.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30 23:37:36 +11:00
David Gibson c920f7b42f spapr: Small fixes to rtas_ibm_get_system_parameter, remove rtas_st_buffer
rtas_st_buffer() appears in spapr.h as though it were a widely used helper,
but in fact it is only used for saving data in a format used by
rtas_ibm_get_system_parameter().  This changes it to a local helper more
specifically for that function.

While we're there fix a couple of small defects in
rtas_ibm_get_system_parameter:
  - For the string value SPLPAR_CHARACTERISTICS, it wasn't including the
    terminating \0 in the length which it should according to LoPAPR
    7.3.16.1
  - It now checks that the supplied buffer has at least enough space for
    the length of the returned data, and returns an error if it does not.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-30 23:37:36 +11:00
Mark Cave-Ayland 03c1280bf5 macio: use the existing IDEDMA aiocb to hold the active DMA aiocb
Currently the aiocb is held within MACIOIDEState, however the IDE core code
assumes that the current actvie DMA aiocb is held in aiocb in a few places,
e.g. ide_bus_reset() and ide_reset().

Switch over to using IDEDMA aiocb to store the aiocb for the current active
DMA request so that bus resets and restarts are handled correctly. As a
consequence we can now use ide_set_inactive() rather than handling its
functionality ourselves.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-30 23:37:25 +11:00
Peter Maydell 0d75590d91 ppc: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-6-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:22 +00:00
Peter Maydell 3a87d00910 fpu: Replace uint32 typedef with uint32_t
Replace the uint32 softfloat-specific typedef with uint32_t.
This change was made with

find include hw fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\buint32\b/uint32_t/g'

together with manual removal of the typedef definition,
manual undoing of various mis-hits, and another couple of
fixes found via test compilation.

All the uses in hw/ were using the wrong type by mistake.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-5-git-send-email-peter.maydell@linaro.org
2016-01-22 15:09:21 +00:00
Daniel P. Berrange 7746abd8e9 qom: Change object property iterator API contract
Currently the ObjectProperty iterator API works as follows:

  ObjectPropertyIterator *iter;

  iter = object_property_iter_init(obj);
  while ((prop = object_property_iter_next(iter))) {
     ...
  }
  object_property_iter_free(iter);

This has the benefit that the ObjectPropertyIterator struct
can be opaque, but has the downside that callers need to
explicitly call a free function. It is also not in keeping
with iterator style used elsewhere in QEMU/GLib2.

This patch changes the API to use stack allocation instead:

  ObjectPropertyIterator iter;

  object_property_iter_init(&iter, obj);
  while ((prop = object_property_iter_next(&iter))) {
     ...
  }

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[AF: Fused ObjectPropertyIterator struct with typedef]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2016-01-18 17:47:58 +01:00
Markus Armbruster 9af9e0fed7 error: Strip trailing '\n' from error string arguments (again)
Commit 6daf194d, be62a2eb and 312fd5f got rid of a bunch, but they
keep coming back.  Tracked down with the Coccinelle semantic patch
from commit 312fd5f.

Cc: Fam Zheng <famz@redhat.com>
Cc: Peter Crosthwaite <crosthwaitepeter@gmail.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Dominik Dingel <dingel@linux.vnet.ibm.com>
Cc: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Cc: Stefan Berger <stefanb@linux.vnet.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Changchun Ouyang <changchun.ouyang@intel.com>
Cc: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Markus Armbruster <armbru@pond.sub.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-17-git-send-email-armbru@redhat.com>
2016-01-13 15:16:18 +01:00
Markus Armbruster b83baa6025 spapr: Use error_reportf_err()
Not caught by Coccinelle, because we report the error only
conditionally here.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-14-git-send-email-armbru@redhat.com>
2016-01-13 15:16:17 +01:00
Markus Armbruster c29b77f955 error: Use error_reportf_err() where it makes obvious sense
Done with this Coccinelle semantic patch

    @@
    expression FMT, E, S;
    expression list ARGS;
    @@
    -    error_report(FMT, ARGS, error_get_pretty(E));
    +    error_reportf_err(E, FMT/*@@@*/, ARGS);
    (
    -    error_free(E);
    |
	 exit(S);
    |
	 abort();
    )

followed by a replace of '%s"/*@@@*/' by '"' and some line rewrapping,
because I can't figure out how to make Coccinelle transform strings.

We now use the error whole instead of just its message obtained with
error_get_pretty().  This avoids suppressing its hint (see commit
50b7b00), but I can't see how the errors touched in this commit could
come with hints.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1450452927-8346-12-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-01-13 15:16:17 +01:00
Markus Armbruster 4fffeb5e19 error: Use error_report_err() where appropriate (again)
Same Coccinelle semantic patch as in commit 565f65d.

We now use the original error whole instead of just its message
obtained with error_get_pretty().  This avoids suppressing its hint
(see commit 50b7b00), but I don't think the errors touched in this
commit can come with hints.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1450452927-8346-3-git-send-email-armbru@redhat.com>
2016-01-13 15:16:16 +01:00
Markus Armbruster c525436e69 hw: Don't use hw_error() for machine initialization errors
Printing CPU registers is not helpful during machine initialization.
Moreover, these are straightforward configuration or "can get
resources" errors, so dumping core isn't appropriate either.  Replace
hw_error() by error_report(); exit(1).  Matches how we report these
errors in other machine initializations.

Cc: Richard Henderson <rth@twiddle.net>
Cc: qemu-arm@nongnu.org
Cc: qemu-ppc@nongnu.org
Cc: Guan Xuetao <gxt@mprc.pku.edu.cn>
Signed-off-by: Markus Armbruster <armbru@pond.sub.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1450370121-5768-2-git-send-email-armbru@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2016-01-13 11:58:58 +01:00
Markus Armbruster 6231a6da9f hw: Inline the qdev_prop_set_drive_nofail() wrapper
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1449764955-10741-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-01-13 11:58:58 +01:00
David Gibson 87bbdd9caf hw/ppc/spapr: fix spapr->kvm_type leak
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
[fixed return type of spapr_machine_finalizefn()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-11 15:29:05 +11:00
Cao jin 215e209846 spapr vio: fix to incomplete QOMify
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-11 15:29:05 +11:00
Thomas Huth 57040d4513 hw/ppc/spapr: Use XHCI as host controller for new spapr machines
The OHCI has some bugs and performance issues, so for
newer machines it's preferable to use XHCI instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-11 15:29:05 +11:00
David Gibson 4b23699c82 pseries: Add pseries-2.6 machine type
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson fccbc78500 pseries: Improve setting of default machine version
This tweaks the way the default machine version is controlled, so that
there will be a bit less churn when each new version is introduced.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson fc9f38c3c0 pseries: Restructure class_options functions
Currently each of the *_class_options() functions for the pseries-2.1 ..
pseries-2.5 machine types are standalone.  This will become harder to
maintain as new versions are added.

This patch restructures them similarly to x86 where each function calls
the one from the next version, then overrides anything necessary for
compatibility with the specific version and older.

The default behaviour - that for the most recent machine are set up in
the base class initializer spapr_machine_class_init().  Previously it had
some things set up to default to older behaviour with the more recent
machines overriding it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson 5013c54746 pseries: DEFINE_SPAPR_MACHINE
At the moment all the class_init functions and TypeInfo structures for the
various versioned pseries machine types are open-coded.  As more versions
are created this is getting increasingly clumsy.

This patch borrows the approach used in PC, using a DEFINE_SPAPR_MACHINE()
macro to construct most of the boilerplate from simpler 'class_options' and
'instance_options' functions.

This patch makes a small semantic change - the versioned machine types are
now registered through machine_init() instead of type_init().  Since the
new way is how PC already did it, I'm assuming that's correct.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson f949b4e5f5 pseries: Use SET_MACHINE_COMPAT
To make the spapr_machine_*_class_init() functions a little less bulky.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson 0eb9054c60 pseries: Remove versions from mc->desc
Currently, the versioned spapr machine types put the machine type version
into the description string.  PC does not do this, using just the name
itself to distinguish.  Doing the same lets us move setting the description
into the common base class, simplifying the code slightly.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson 64f0f70a00 pseries: Remove redundant calls to spapr_machine_initfn()
The instance_init() functions for several of the pseries-x.y versioned
machine types explicitly call spapr_machine_initfn().  But that's the
instance_init function for the common parent of all those machine types,
so will already have been called beforehand by the QOM infrastructure.

Remove the redundant calls.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:05 +11:00
David Gibson 1c5f29bbc8 pseries: Rearrange versioned machine type code
hw/ppc/spapr.c has a number of definitions related to the various versioned
machine types ("pseries-2.1" .. "pseries-2.5") it defines.  These are
mostly arranged by type of function first, then machine version second, and
it's not consistent about whether it goes in increasing or decreasing
version order.

This rearranges the code to keep all the definitions for a particular
machine version together, and arrange then consistently in order most
recent to least recent.

This brings us closer to matching the way PC does things, and makes later
cleanups easier to follow.

Apart from adding some comments marking each section, this is a pure
mechanical rearrangement with no semantic changes.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:04 +11:00
David Gibson aec39c5349 pseries: Remove redundant setting of mc->name for pseries-2.5 machine
98cec76 "machine: Set MachineClass::name automatically" removed the setting
of mc->name for the pseries machine types, since it can be derived
automatically from the type names constructed with MACHINE_TYPE_NAME().

Unfortunately fb0fc8f "spapr: Create pseries-2.5 machine" went in later and
brought one of them back.

This removes it again.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-01-11 15:29:04 +11:00
Alexey Kardashevskiy 3dc0a66d26 spapr: Add /system-id
Section B.6.2.1 Root Node Properties of PAPR specification defines
a set of properties which shall be present in the device tree root,
one of these properties is "system-id" which "should be unique across
all systems and all manufacturers". Since UUID is meant to be unique,
it makes sense to use it as "system-id".

This adds "system-id" property to the device tree root when not empty.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-11 15:29:04 +11:00
Thomas Huth 54c6de864f hw/ppc/spapr_rtc: Remove bad class_size value
class_size = sizeof(XICSStateClass) does not make much sense
in the RTC code and likely was just a copy-n-paste error.
Let's simply remove it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-01-11 13:25:40 +11:00
Markus Armbruster ab8bf1d735 spapr_drc: Change value of property "fdt" from null back to {}
prop_get_fdt() misuses the visitor API: when fdt is null, it doesn't
visit anything.  object_property_get_qobject() happily
object_property_get_qobject().  Amazingly, the latter survives the
misuse.  Turns out we've papered over it long before prop_get_fdt()
existed, in commit 1d10b44.

However, commit 6c2f9a1 changed how we paper over it, and as a side
effect changed qom-get's value from {} to null.  Change it right back
by fixing the visitor misuse.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-12-04 16:50:59 +11:00
Markus Armbruster c401ae8c9c spapr_drc: Make device "spapr-dr-connector" unavailable with -device
It should only be created via spapr_dr_connector_new().  Attempting to
create it with -device crashes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-12-04 10:56:29 +11:00
Markus Armbruster c75304a139 spapr_drc: Handle visitor errors properly
Since prop_get_fdt() is only used with QmpOutputVisitor, errors
shouldn't actually happen, so this is only a latent bug.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-12-04 10:56:29 +11:00
Peter Maydell e2a176dfda hw/ppc/ppc405_boards: Fix infinite recursion by converting taihu_cpld from old_mmio
The taihu_cpld_writel() function had an obvious typo that meant that
if it was ever called it would go into an infinite recursion. Newer
versions of clang will detect and warn about this:
  hw/ppc/ppc405_boards.c:481:1: warning: all paths through this function will call itself [-Winfinite-recursion]

Fix this by converting taihu_cpld from the legacy old_mmio accessors
to new-style ones, with an impl {} declaration to cause the core
memory code to do the splitting of 16 bit and 32 bit accesses into
multiple 8-bit accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-11-30 19:39:00 +11:00
Thomas Huth 9b7a70e63e hw/ppc/spapr: Remove duplicated "pseries" alias
The "pseries" alias is currently set twice, one time for the
pseries-2.4 machine and one time for the "pseries-2.5" machine.
To avoid confusion with the alias, let's remove the one from
the older machine class. And while we're at it, also remove
the "is_default = 0" there since the is_default variable
should be set to zero by default already.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-11-30 19:39:00 +11:00
Stefano Dong (董兴水) 903a41d341 Fix memory leak on error
hw/ppc/spapr.c: Fix memory leak on error, it was introduced in bc09e0611
hw/acpi/memory_hotplug.c: Fix memory leak on error, it was introduced in 34f2af3d

Signed-off-by: Stefano Dong (董兴水) <opensource.dxs@aliyun.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-26 14:27:52 +02:00
Daniel P. Berrange 9a842f7d3c ppc: Convert spapr code to use object property iterators
Stop directly accessing the Object::properties field data
structure and instead use the formal object property iterator
APIs. This insulates the code from future data structure
changes in the Object struct.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2015-11-18 21:13:49 +01:00
Mark Cave-Ayland cffc331a31 cuda.c: add delay to setting of SR_INT bit
MacOS 9 is racy when it comes to accessing the shift register. Fix this by
introducing a small delay between data accesses and raising the SR_INT
interrupt bit.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-11-12 13:15:55 +11:00
Alexander Graf 72f1f97d49 PPC: mac99: Always add USB controller
The mac99 machines always have a USB controller. Usually not having one around
doesn't hurt quite as much, but Mac OS 9 really really wants one or it crashes
on bootup.

So always add OHCI to make it happy.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-11-12 13:15:54 +11:00
Bharata B Rao b41d320fef spapr: Handle failure of KVM_PPC_ALLOCATE_HTAB ioctl
KVM_PPC_ALLOCATE_HTAB ioctl can return -ENOMEM for KVM guests and QEMU
never handled this correctly. But this didn't cause any problems till
now as KVM_PPC_ALLOCATE_HTAB ioctl returned with smaller than requested
HTAB when enough contiguous memory wasn't available in the host.
After the proposed kernel change: https://patchwork.ozlabs.org/patch/530501/,
KVM_PPC_ALLOCATE_HTAB ioctl will not fallback to lower sized HTAB
allocation and will fail if requested HTAB size can't be met.

Check for such failures in QEMU and abort appropriately. This will
prevent guest kernel from hanging/freezing during early boot by doing
graceful exit when host is unable to allocate requested HTAB.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-11-11 13:29:04 +11:00
Dr. David Alan Gilbert a3e06c3d13 Rename save_live_complete to save_live_complete_precopy
In postcopy we're going to need to perform the complete phase
for postcopiable devices at a different point, start out by
renaming all of the 'complete's to make the difference obvious.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:49 +01:00
Cornelia Huck 80fd50f96b ppc/spapr: add 2.4 compat props
HW_COMPAT_2_4 will become non-empty: prepare for it.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1444991154-79217-3-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-10-29 17:59:26 +00:00
Michael S. Tsirkin d6a9b0b89d Revert "memhp: extend address auto assignment to support gaps"
This reverts commit df0acded19.

There's no point to it now that the only user has been reverted.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-29 11:11:07 +02:00
Paolo Bonzini 659f7f6556 prep: do not use CPU_LOG_IOPORT, convert to tracepoints
These messages are disabled by default; a perfect usecase for tracepoints.
Convert them over.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 12:38:28 +11:00
Benjamin Herrenschmidt 90da0d5a70 ppc/spapr: Add "ibm,pa-features" property to the device-tree
LoPAPR defines a "ibm,pa-features" per-CPU device tree property which
describes extended features of the Processor Architecture.

This adds the property to the device tree. At the moment this is the
copy of what pHyp advertises except "I=1 (cache inhibited) Large Pages"
which is enabled for TCG and disabled when running under HV KVM host
with 4K system page size.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: rebased, changed commit log, moved ci_large_pages initialization,
renamed pa_features arrays]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 12:22:40 +11:00
David Gibson 185181f883 spapr_pci: Allow VFIO devices to work on the normal PCI host bridge
The core VFIO infrastructure more or less allows VFIO devices to work
on any normal guest PCI host bridge (PHB) without extra logic.
However, the "spapr-pci-host-bridge" device (as opposed to the special
"spapr-pci-vfio-host-bridge" device) breaks this by using a partially
KVM accelerated implementation of the guest kernel IOMMU which won't
work with VFIO devices, without additional kernel support.

This patch allows VFIO devices to work on the spapr-pci-host-bridge,
by having it switch off KVM TCE acceleration when a VFIO device is
added to the PHB (either on startup, or by hotplug).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23 10:38:10 +11:00
David Gibson c10325d6f9 spapr_iommu: Provide a function to switch a TCE table to allowing VFIO
Because of the way non-VFIO guest IOMMU operations are KVM accelerated, not
all TCE tables (guest IOMMU contexts) can support VFIO devices.  Currently,
this is decided at creation time.

To support hotplug of VFIO devices, we need to allow a TCE table which
previously didn't allow VFIO devices to be switched so that it can.  This
patch adds an spapr_tce_set_need_vfio() function to do this, by
reallocating the table in userspace if necessary.

Currently this doesn't allow the KVM acceleration to be re-enabled if all
the VFIO devices are removed.  That's an optimization for another time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23 10:38:10 +11:00
David Gibson 6a81dd172c spapr_iommu: Rename vfio_accel parameter
The vfio_accel parameter used when creating a new TCE table (guest IOMMU
context) has a confusing name.  What it really means is whether we need the
TCE table created to be able to support VFIO devices.

VFIO is relevant, because when available we use in-kernel acceleration of
the TCE table, but that may not work with VFIO devices because updates to
the table are handled in kernel, bypass qemu and so don't hit qemu's
infrastructure for keeping the VFIO host IOMMU state in sync with the guest
IOMMU state.

Rename the parameter to "need_vfio" throughout.  This is a cosmetic change,
with no impact on the logic.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23 10:38:10 +11:00
David Gibson f93caaac36 spapr_pci: Allow PCI host bridge DMA window to be configured
At present the PCI host bridge (PHB) for the pseries machine type has a
fixed DMA window from 0..1GB (in PCI address space) which is mapped to real
memory via the PAPR paravirtualized IOMMU.

For better support of VFIO devices, we're going to want to allow for
different configurations of the DMA window.

Eventually we'll want to allow the guest itself to reconfigure the window
via the PAPR dynamic DMA window interface, but as a preliminary this patch
allows the user to reconfigure the window with new properties on the PHB
device.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2015-10-23 10:38:10 +11:00
Thomas Huth fd5da5c472 spapr: Add "slb-size" property to CPU device tree nodes
According to a commit message in the Linux kernel (see here
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b60c31d85a2a
for example), the name of the property that carries the information
about the number of SLB entries should be called "slb-size", and
not "ibm,slb-size". The Linux kernel can deal with both names, but
to be on the safe side we should support the official name, too.

[Now that LoPAPR is public, the relevant requirement can be found in
section C.6.1.8 --dwg]

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 10:38:10 +11:00
Bharata B Rao 7735fedaf4 spapr: Abort when HTAB of requested size isn't allocated
Terminate the guest when HTAB of requested size isn't allocated by
the host.

When memory hotplug is attempted on a guest that has booted with
less than requested HTAB size, the guest kernel will not be able
to gracefully fail the hotplug request. This patch will ensure that
we never end up in a situation where memory hotplug fails due to
less than requested HTAB size.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 10:38:10 +11:00
Bharata B Rao b817772a25 spapr: Allocate HTAB from machine init
Allocate HTAB from ppc_spapr_init() so that we can abort the guest
if requested HTAB size is't allocated by the host. However retain the
htab reset call in spapr_reset_htab() so that HTAB gets reset (and
not allocated) during machine reset.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-10-23 10:38:10 +11:00
Benjamin Herrenschmidt b798c19057 ppc/spapr: Allow VIRTIO_VGA
It works fine with the Linux driver out of the box

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2015-10-20 09:26:36 +02:00
Christopher Covington 4a7428c5a7 s/cpu_get_real_ticks/cpu_get_host_ticks/
This should help clarify the purpose of the function that returns
the host system's CPU cycle count.

Signed-off-by: Christopher Covington <cov@codeaurora.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
ppc portion
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-10-08 19:46:01 +03:00
Igor Mammedov df0acded19 memhp: extend address auto assignment to support gaps
setting gap to TRUE will make sparse DIMM
address auto allocation, leaving gaps between
a new DIMM address and preceeding existing DIMM.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-02 17:04:32 +03:00
Peter Crosthwaite 4ecd4d16a0 ppc: Rename ELF_MACHINE to be PPC specific
Rename ELF_MACHINE to be PPC specific. This is used as-is by the
various PPC bootloaders and is locally defined to ELF_MACHINE in linux
user in PPC specific ifdeffery.

This removes another architecture specific definition from the global
namespace (as desired by multi-arch).

Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-ppc@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-By: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-09-25 12:04:44 +02:00
Gavin Shan d76548a98f sPAPR: Enable EEH on VFIO PCI device only
This checks if the PCI device retrieved from the PCI device address
is VFIO PCI device when enabling EEH functionality. If it's not
VFIO PCI device, the EEH functonality isn't enabled.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Gavin Shan 47445c80fb sPAPR: Revert don't enable EEH on emulated PCI devices
This reverts commit 7cb18007 ("sPAPR: Don't enable EEH on emulated
PCI devices") as rtas_ibm_set_eeh_option() isn't the right place
to check if there has the corresponding PCI device for the input
address, which can be PE address, not PCI device address.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Thomas Huth 4d9392be6c ppc/spapr: Implement H_RANDOM hypercall in QEMU
The PAPR interface defines a hypercall to pass high-quality
hardware generated random numbers to guests. Recent kernels can
already provide this hypercall to the guest if the right hardware
random number generator is available. But in case the user wants
to use another source like EGD, or QEMU is running with an older
kernel, we should also have this call in QEMU, so that guests that
do not support virtio-rng yet can get good random numbers, too.

This patch now adds a new pseudo-device to QEMU that either
directly provides this hypercall to the guest or is able to
enable the in-kernel hypercall if available. The in-kernel
hypercall can be enabled with the use-kvm property, e.g.:

 qemu-system-ppc64 -device spapr-rng,use-kvm=true

For handling the hypercall in QEMU instead, a "RngBackend" is
required since the hypercall should provide "good" random data
instead of pseudo-random (like from a "simple" library function
like rand() or g_random_int()). Since there are multiple RngBackends
available, the user must select an appropriate back-end via the
"rng" property of the device, e.g.:

 qemu-system-ppc64 -object rng-random,filename=/dev/hwrng,id=gid0 \
                   -device spapr-rng,rng=gid0 ...

See http://wiki.qemu-project.org/Features-Done/VirtIORNG for
other example of specifying RngBackends.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Thomas Huth ef001f069e ppc/spapr: Fix buffer overflow in spapr_populate_drconf_memory()
The buffer that is allocated in spapr_populate_drconf_memory()
is used for setting both, the "ibm,dynamic-memory" and the
"ibm,associativity-lookup-arrays" property. However, only the
size of the first one is taken into account when allocating the
memory. So if the length of the second property is larger than
the length of the first one, we run into a buffer overflow here!
Fix it by taking the length of the second property into account,
too.

Fixes: "spapr: Support ibm,dynamic-reconfiguration-memory" patch
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
David Gibson 20bb648dca spapr: Fix default NUMA node allocation for threads
At present, if guest numa nodes are requested, but the cpus in each node
are not specified, spapr just uses the default behaviour or assigning each
vcpu round-robin to nodes.

If smp_threads != 1, that will assign adjacent threads in a core to
different NUMA nodes.  As well as being just weird, that's a configuration
that can't be represented in the device tree we give to the guest, which
means the guest and qemu end up with different ideas of the NUMA topology.

This patch implements mc->cpu_index_to_socket_id in the spapr code to
make sure vcpus get assigned to nodes only at the socket granularity.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2015-09-23 10:51:11 +10:00
Bharata B Rao 0a4178692c spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type
Till now memory hotplug used RTAS_LOG_V6_HP_ID_DRC_INDEX hotplug type
which meant that we generated one hotplug type of EPOW event for every
256MB (SPAPR_MEMORY_BLOCK_SIZE). This quickly overruns the kernel
rtas log buffer thus resulting in loss of memory hotplug events. Switch
to RTAS_LOG_V6_HP_ID_DRC_COUNT hotplug type for memory so that we
generate only one event per hotplug request.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao 7a36ae7a9f spapr: Support hotplug by specifying DRC count
Support hotplug identifier type RTAS_LOG_V6_HP_ID_DRC_COUNT that allows
hotplugging of DRCs by specifying the DRC count.

While we are here, rename

spapr_hotplug_req_add_event() to spapr_hotplug_req_add_by_index()
spapr_hotplug_req_remove_event() to spapr_hotplug_req_remove_by_index()

so that they match with spapr_hotplug_req_add_by_count().

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao e8f986fc57 spapr: Revert to memory@XXXX representation for non-hotplugged memory
Don't represent non-hotluggable memory under drconf node. With this
we don't have to create DRC objects for them.

The effect of this patch is that we revert back to memory@XXXX representation
for all the memory specified with -m option and represent the cold
plugged memory and hot-pluggable memory under
ibm,dynamic-reconfiguration-memory.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00
Bharata B Rao 6663864e95 spapr: Populate ibm,associativity-lookup-arrays correctly for non-NUMA
When NUMA isn't configured explicitly, assume node 0 is present for
the purpose of creating ibm,associativity-lookup-arrays property
under ibm,dynamic-reconfiguration-memory DT node. This ensures that
the associativity index property is correctly updated in ibm,dynamic-memory
for the LMB that is hotplugged.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2015-09-23 10:51:11 +10:00