Commit Graph

114 Commits

Author SHA1 Message Date
David Gibson 12f421745c pseries: Move sPAPR RTC code into its own file
At the moment the RTAS (firmware/hypervisor) time of day functions are
implemented in spapr_rtas.c along with a bunch of other things.  Since
we're going to be expanding these a bit, move the RTAS RTC related code
out into new file spapr_rtc.c.  Also add its own initialization function,
spapr_rtc_init() called from the main machine init routine.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09 14:59:56 +01:00
Alexey Kardashevskiy b194df478a spapr-pci: Enable huge BARs
At the moment sPAPR only supports 512MB window for MMIO BARs. However
modern devices might want bigger 64bit BARs.

This extends MMIO window from 512MB to 62GB (aligned to
SPAPR_PCI_WINDOW_SPACING) and advertises it in 2 records in
the PHB "ranges" property. 32bit gets the space from
SPAPR_PCI_MEM_WIN_BUS_OFFSET till the end of 4GB, 64bit gets the rest
of the space. If no space is left, 64bit range is not advertised.

The MMIO space size is set to old value of 0x20000000 by default
for pseries machines older than 2.3.

The approach changes the device tree which is a guest visible change, however
it won't break migration as:
1. we do not support migration to older QEMU versions
2. migration to newer QEMU will migrate the device tree as well and since
the new layout only extends the old one and does not change address mappigns,
no breakage is expected here too.

SLOF change is required to utilize this extension.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09 14:59:54 +01:00
Alexey Kardashevskiy 3dab024430 spapr: Add pseries-2.3 machine
The next patch will make MMIO space bigger and keep the old value for
older pseries machines.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-03-09 14:59:54 +01:00
Peter Maydell 2dffe5516e NUMA fixes queue
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJU639qAAoJECgHk2+YTcWmIC0QAK36G9LpeDOJONxt7LNOCMes
 Oplwp11RRA5zs5YHVCzTL0NMZEf59xRs/35r1iIrt238clmFGpUdZRpdlsZdDMZL
 ThwxgrV7/aZH/+HyZVSTCh5FL+fTyl+DjVQ1cZqNUWQRQ6KO2AjSxA53gCoFEZ+L
 1Vg6A41u1UOdO3tMOosuDcZtkN4fXevHTuIFFzfzW4WhJuctU2P/VPsHKxESKxYH
 1Kz0S9CCr6meIqZ8DE9y5qhCUdXBj9VYbEY2JZUYVaeB8q+smY7Lz1cV7MABrD3T
 EoJa3/53b4DFl4wNkLJsuqQNFHiOrQadgmbhocgEYp3u96/7cKa+pZ8mTKgvZ14D
 kxtpcL1TeiyA3b7UfITPhqLY3mKvN7UIKh/vo0haAxj5B3H6QhemkO76oA1/zkJR
 zI3mm6trnxurZm1gBtkZwPbKoDi7OuS7lYxW3GT7K5nJBMCEAbFGXFzmoHomVDVZ
 phYGwihzUg/LWMwaZRT0bbUd2GOVAXGY7Zy9Kg0p8nvPtcEYDwp0spQDfyrnwEWu
 Bohi80x97ayRXrvY1mP5mdYg1Jj8jR4qPdwnoW6M2UHXvtdQsjfFoSj4/xbVMCb9
 l5y1vH+DHG/ZNQB0URjBoXmttWkZjpbiAbo0daPFDaZ1QRfdXx6JgdkTKtA8ozQ9
 Xvh4hVGPeLAPvRjzaesa
 =0V2i
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging

NUMA fixes queue

# gpg: Signature made Mon Feb 23 19:28:42 2015 GMT using RSA key ID 984DC5A6
# gpg: Can't check signature: public key not found

* remotes/ehabkost/tags/numa-pull-request:
  numa: Rename set_numa_modes() to numa_post_machine_init()
  numa: Rename option parsing functions
  numa: Move QemuOpts parsing to set_numa_nodes()
  numa: Make max_numa_nodeid static
  numa: Move NUMA globals to numa.c
  vl.c: Remove unnecessary zero-initialization of NUMA globals
  numa: Move NUMA declarations from sysemu.h to numa.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-03-02 12:13:45 +00:00
Eduardo Habkost e35704ba9c numa: Move NUMA declarations from sysemu.h to numa.h
Not all sysemu.h users need the NUMA declarations, and keeping them in a
separate file makes it easier to see what are the interfaces provided by
numa.c.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2015-02-23 15:39:27 -03:00
Markus Armbruster c86580b889 PPC: Don't use legacy -usbdevice support for setting up board
It's tempting, because usbdevice_create() is so simple to use.  But
there's a lot of unwanted complexity behind the simple interface.
Switch to usb_create_simple().

Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2015-02-18 10:53:10 +01:00
Gonglei 627b84f406 fw_cfg: fix typos in comments: patch -> path
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-02-10 09:27:19 +03:00
Marcel Apfelbaum 4ee9ced979 hw/ppc/spapr: simplify usb controller creation logic
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:29 +01:00
Marcel Apfelbaum 09f28e5b51 hw/usb: simplified usb_enabled
The argument is not longer used and the implementation
uses now QOM instead of QemuOpts.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:29 +01:00
Marcel Apfelbaum c760dbb9dc hw/ppc: modified the condition for usb controllers to be created for some ppc machines
Some ppc machines create a default usb controller based on a 'machine condition'.
Until now the logic was: create the usb controller if:
 -  the usb option was supplied in cli and value is true or
 -  the usb option was absent and both set_defaults and the machine
    condition were true.

Modified the logic to:
Create the usb controller if:
 - the machine condition is true and defaults are enabled or
 - the usb option is supplied and true.

The main for this is to simplify the usb_enabled method.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:28 +01:00
Peter Maydell 2f285bdd54 target-ppc: Cast ssize_t to size_t before printing with %zx
The mingw32 compiler complains about trying to print variables of type
ssize_t with the %z format string specifier. Since we're printing it
as unsigned hex anyway, cast to size_t to silence the warning.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:28 +01:00
Samuel Mendoza-Jonas e6b8fd246c spapr: Fix stale HTAB during live migration (TCG)
If a TCG guest reboots during a running migration HTAB entries are not
marked dirty, and the destination boots with an invalid HTAB.

When a reboot occurs, explicitly mark the current HTAB dirty after
clearing it.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:26 +01:00
Samuel Mendoza-Jonas 338c25b692 spapr: Fix integer overflow during migration (TCG)
The n_valid and n_invalid fields are unsigned short integers but it is
possible to have more than 65535 entries in a contiguous hunk, overflowing
the field. This results in an incorrect HTAB being sent to the destination
during migration.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:26 +01:00
Samuel Mendoza-Jonas 01a579729b spapr: Fix stale HTAB during live migration (KVM)
If a guest reboots during a running migration, changes to the
hash page table are not necessarily updated on the destination.
Opening a new file descriptor to the HTAB forces the migration
handler to resend the entire table.

Signed-off-by: Samuel Mendoza-Jonas <sam.mj@au1.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2015-01-07 16:16:26 +01:00
Marcel Apfelbaum 49d2e648e8 machine: remove qemu_machine_opts global list
QEMU has support for options per machine, keeping
a global list of options is no longer necessary.

Signed-off-by: Marcel Apfelbaum <marcel.a@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Greg Bellows <greg.bellows@linaro.org>
Message-id: 1418217570-15517-2-git-send-email-marcel.a@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2014-12-22 23:12:27 +00:00
Alexander Graf 9e3f973335 spapr: Allow dynamic creation of PHB
Now that we finally check for presence of dangling sysbus devices, make check
started complaining that the sPAPR PHB is one such device.

However, it really isn't. The spapr PHB is not really a traditional sysbus
device, but much more a special spapr pv device which is already able to get
created dynamically.

Move spapr to its own dynamic sysbus check handling and allow PHB devices to
get allocated dynamically.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:15 +01:00
David Gibson 4aee73623d spapr: Cleanup machine naming conventions, and prepare for 2.2 release
As of qemu-2.1, spapr/pseries, has a set of versioned machine classes to
represent the machine type as it appeared to the guest in different qemu
versions.  This allows for safe migration of guests between current and
future qemu versions.

However, these are organized a bit differently from those for PC: on PC,
the default plain "pc" machine type is just an alias for the most recent
versioned machine type.  In sPAPR, it names the base machine class from
which the versioned types are derived.

The PC approach is preferable; it makes it clearer which explicit version
is the current one.  Additionally updating the "current" machine as the
base class makes it even more likely than otherwise to incorrectly alter
the versioned machines' behaviour when updating the current machine.

Therefore this patch changes sPAPR to the PC approach - the base class
becomes abstract, and plain "pseries" becomes an alias for the most
recent versioned machine class.  Since qemu-2.1 is now released, we also
create a new pseries-2.2 machine type, to incorporate changes during this
development cycle (for now it is identical to pseries-2.1).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-11-04 23:26:11 +01:00
Michael S. Tsirkin 68a27b208a virtio-pci: fix migration for pci bus master
Current support for bus master (clearing OK bit) together with the need to
support guests which do not enable PCI bus mastering, leads to extra state in
VIRTIO_PCI_FLAG_BUS_MASTER_BUG bit, which isn't robust in case of cross-version
migration for the case when guests use the device before setting DRIVER_OK.

Rip out this code, and replace it:
-   Modern QEMU doesn't need VIRTIO_PCI_FLAG_BUS_MASTER_BUG
    so just drop it for latest machine type.
-   For compat machine types, set PCI_COMMAND if DRIVER_OK
    is set.

As this is needed for 2.1 for both pc and ppc, move PC_COMPAT macros from pc.h
to a new common header.

Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
2014-11-02 12:03:03 +02:00
Markus Armbruster 4be746345f hw: Convert from BlockDriverState to BlockBackend, mostly
Device models should access their block backends only through the
block-backend.h API.  Convert them, and drop direct includes of
inappropriate headers.

Just four uses of BlockDriverState are left:

* The Xen paravirtual block device backend (xen_disk.c) opens images
  itself when set up via xenbus, bypassing blockdev.c.  I figure it
  should go through qmp_blockdev_add() instead.

* Device model "usb-storage" prompts for keys.  No other device model
  does, and this one probably shouldn't do it, either.

* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
  blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
  which has only the BlockDriverState.

* PC87312State has an unused BlockDriverState[] member.

The next two commits take care of the latter two.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 14:02:25 +02:00
Markus Armbruster fa1d36df74 block: Eliminate DriveInfo member bdrv, use blk_by_legacy_dinfo()
The patch is big, but all it really does is replacing

    dinfo->bdrv

by

    blk_bs(blk_by_legacy_dinfo(dinfo))

The replacement is repetitive, but the conversion of device models to
BlockBackend is imminent, and will shorten it to just
blk_legacy_dinfo(dinfo).

Line wrapping muddies the waters a bit.  I also omit tests whether
dinfo->bdrv is null, because it never is.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Benoît Canet <benoit.canet@nodalink.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-10-20 13:41:27 +02:00
zhanghailiang 9d632f5f68 Fix typos and misspellings in comments
formated -> formatted
gaurantee -> guarantee
shear -> sheer

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2014-09-20 17:55:53 +04:00
Anton Blanchard 85423d90c7 hypervisor property clashes with hypervisor node
dtc fails on a recent QEMU snapshot:

ERROR (name_properties): "name" property in /hypervisor#1 is incorrect ("hypervisor" instead of base node name)

Looking at the device tree we have a hypervisor property:

# lsprop hypervisor
hypervisor       "kvm"

But we also have a hypervisor node, with a name that doesn't match:

# lsprop hypervisor#1/
name             "hypervisor"
compatible       "linux,kvm"
linux,phandle    7e5eb5d8 (2120136152)

Commit c08ce91d309c (spapr: add uuid/host details to device tree)
looks to have collided with an earlier patch. Remove the hypervisor
property.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:54 +02:00
Greg Kurz 8c46f7ec85 spapr_pci: map the MSI window in each PHB
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.

Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(

This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
   - since each PHB instantiates its own IOMMU address space, we
     can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
   - no real need to keep the MSI window setup in a separate function,
     the spapr_pci_msi_init() code moves to spapr_phb_realize().

2) kill the global MSI window as it is not needed in the end

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:53 +02:00
Nikunj A Dadhania 9674a35626 ppc/spapr: Fix MAX_CPUS to 255
MAX_CPUS 256 is inconsistent with qemu supporting upto 255 cpus. This
MAX_CPUS number was percolated back to "virsh capabilities" with wrong
max_cpus.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:49 +02:00
Benjamin Herrenschmidt b7d1f77ada spapr: Locate RTAS and device-tree based on real RMA
We currently calculate the final RTAS and FDT location based on
the early estimate of the RMA size, cropped to 256M on KVM since
we only know the real RMA size at reset time which happens much
later in the boot process.

This means the FDT and RTAS end up right below 256M while they
could be much higher, using precious RMA space and limiting
what the OS bootloader can put there which has proved to be
a problem with some OSes (such as when using very large initrd's)

Fortunately, we do the actual copy of the device-tree into guest
memory much later, during reset, late enough to be able to do it
using the final RMA value, we just need to move the calculation
to the right place.

However, RTAS is still loaded too early, so we change the code to
load the tiny blob into qemu memory early on, and then copy it into
guest memory at reset time. It's small enough that the memory usage
doesn't matter.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl, defined RTAS_MAX_ADDR]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy c3b4f589d8 spapr: Fix ibm, associativity for memory nodes
We want the associtivity lists of memory and CPU nodes to match but
memory nodes have incorrect domain#3 which is zero for CPU so they won't
match.

This clears domain#3 in the list to match CPUs associtivity lists.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy b082d65a30 spapr: Add a helper for node0_size calculation
In multiple places there is a node0_size variable calculation
which assumes that NUMA node #0 and memory node #0 are the same
things which they are not. Since we are going to change it and
do not want to change it in multiple places, let's make a helper.

This adds a spapr_node0_size() helper and makes use of it.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy 6010818c30 spapr: Split memory nodes to power-of-two blocks
Linux kernel expects nodes to have power-of-two size and
does WARN_ON if this is not the case:
[    0.041456] WARNING: at drivers/base/memory.c:115
which is:

===
	/* Validate blk_sz is a power of 2 and not less than section size */
	if ((block_sz & (block_sz - 1)) || (block_sz < MIN_MEMORY_BLOCK_SIZE)) {
        	WARN_ON(1);
	        block_sz = MIN_MEMORY_BLOCK_SIZE;
	}
===

This splits memory nodes into set of smaller blocks with
a size which is a power of two. This makes sure the start
address of every node is aligned to the node size.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: squash windows compile fix in]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy 7db8a127e3 spapr: Refactor spapr_populate_memory() to allow memoryless nodes
Current QEMU does not support memoryless NUMA nodes, however
actual hardware may have them so it makes sense to have a way
to emulate them in QEMU. This prepares SPAPR for that.

This moves 2 calls of spapr_populate_memory_node() into
the existing loop over numa nodes so first several nodes may
have no memory and this still will work.

If there is no numa configuration, the code assumes there is just
a single node at 0 and it has all the guest memory.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:48 +02:00
Alexey Kardashevskiy 81014ac2b8 spapr: Use DT memory node rendering helper for other nodes
This finishes refactoring by using the spapr_populate_memory_node helper
for all nodes and removing leftovers from spapr_populate_memory().

This is not a part of the previous patch because the patches look
nicer apart.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Alexey Kardashevskiy 26a8c353bf spapr: Move DT memory node rendering to a helper
This moves recurring bits of code related to memory@xxx nodes
creation to a helper.

This makes use of the new helper for node@0.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Gonglei a21a7a7012 spapr: fix possible memory leak
get_boot_devices_list() will malloc memory, spapr_finalize_fdt
doesn't free it.

Signed-off-by: Chenliang <chenliang88@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Nikunj A Dadhania ef9514431d spapr: add uuid/host details to device tree
Useful for identifying the guest/host uniquely within the
guest. Adding following properties to the guest root node.

vm,uuid - uuid of the guest
host-model - Host model number
host-serial - Host machine serial number
hypervisor type - Tells its "kvm"

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:47 +02:00
Nikunj A Dadhania 2e14072f9e ppc: spapr-rtas - implement os-term rtas call
PAPR compliant guest calls this in absence of kdump. This finally
reaches the guest and can be handled according to the policies set by
higher level tools(like taking dump) for further analysis by tools like
crash.

Linux kernel calls ibm,os-term when extended property of os-term is set.
This makes sure that a return to the linux kernel is gauranteed.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[agraf: reduce RTAS_TOKEN_MAX]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-09-08 12:50:45 +02:00
Alexey Kardashevskiy 3431648272 spapr: Add support for new NMI interface
This implements an NMI interface POWERPC SPAPR machine.
This enables an "nmi" HMP/QMP command supported on SPAPR.

This calls POWERPC_EXCP_RESET (vector 0x100) in the guest to deliver NMI
to every CPU. The expected result is XMON (in-kernel debugger) invocation.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2014-08-25 13:25:16 +02:00
Alexey Kardashevskiy f92f5da108 spapr: Enable use of huge pages
0b183fc87 "memory: move mem_path handling to
memory_region_allocate_system_memory" disabled -mempath use for all
machines that do not use memory_region_allocate_system_memory() to
register RAM. Since SPAPR uses memory_region_init_ram(), the huge pages
support was disabled for it.

This replaces memory_region_init_ram()+vmstate_register_ram_global() with
memory_region_allocate_system_memory() to get huge pages back.

This changes RAM size from (ram_limit - rma_alloc_size) to ram_limit as
the previous patch moved RMA memory region allocation after RAM allocation
and therefore this change does not have immediate effect but simplifies
the code.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Alexey Kardashevskiy 658fa66b81 spapr: Move RMA memory region registration code
PPC970 does not support VRMA (virtual RMA) so real memory required
for SLOF to execute must be allocated by the KVM_ALLOCATE_RMA ioctl.
Later this memory is used as a part of the guest RAM area.
The RMA allocating code also registers a memory region for this piece
of RAM.

We are going to simplify memory regions layout: RMA memory region
will be a subregion in the RAM memory region, both starting from zero.
This way we will not have to take care of start address alignment for
the piece of RAM next to the RMA.

This moves memory region business closer to the RAM memory region
creation/allocation code.

As this is a mechanical patch, no change in behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: fix compilation on non-kvm systems]
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-15 16:11:59 +02:00
Laurent Dufour 4bce526ec4 target-ppc: KVMPPC_H_CAS fix cpu-version endianess
During KVMPPC_H_CAS processing, the cpu-version updated value is stored
without taking care of the current endianess. As a consequence, the guest
may not switch to the right CPU model, leading to unexpected results.

If needed, the value is now converted.

Fixes: 6d9412ea81 ("target-ppc: Implement "compat" CPU option")
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-07-08 12:10:36 +02:00
Alexey Kardashevskiy ba0e5bf8de spapr: Remove @next_irq
This removes @next_irq from sPAPREnvironment which was used in old
IRQ allocator as XICS is now responsible for IRQs and keeps track of
allocated IRQs.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy bee763dbfb spapr: Move interrupt allocator to xics
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.

This moves an allocator from SPAPR to XICS.

This switches IRQ users to use new API.

This uses LSI/MSI flags to know if interrupt is allocated.

The interrupt release function will be posted as a separate patch.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:26 +02:00
Alexey Kardashevskiy 6026db4501 spapr: Define a 2.1 pseries machine
This adds a v2.1 machine to support backward compatibility
for newer macines in the case if they ever be implemented.

This adds a "pseries-2.1" machine as a child of the "pseries"
machine and only changes visible machine name.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Alexey Kardashevskiy 6ca1502e36 spapr: Fix code design style (s/SPAPRMachine/sPAPRMachineState)
Every single sPAPR QOM object has small first "s".
Most (not all yet) QOM objects have "State" suffix.

This replaces SPAPRMachine with sPAPRMachineState to conform with QEMU
code style and removes redundant empty line.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:25 +02:00
Avik Sil cc84c0f357 spapr: Add "qemu, boot-menu" property to /chosen
This is required to enable boot menu display during booting

Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-27 13:48:22 +02:00
Wanlong Gao 8c85901ed3 NUMA: Add numa_info structure to contain numa nodes info
Add the numa_info structure to contain the numa nodes memory,
VCPUs information and the future added numa nodes host memory
policies.

Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
[Fix hw/ppc/spapr.c - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2014-06-19 18:44:18 +03:00
Badari Pulavarty 9dbae97723 spapr_pci: Advertise MSI quota
Hotplug of multiple disks fails due to MSI vector quota check.
Number of MSI vectors default to 8 allowing only 4 devices.
This happens on RHEL6.5 guest. RHEL7 and SLES11 guests fallback
to INTX.

One way to workaround the issue is to increase total MSIs,
so that MSI quota check allows us to hotplug multiple disks.

This sets the quota to the maximum number of interupts XICS has
which is 1024 now (XICS_IRQS). This moves XICS_IRQS from spapr.c
to xics.h for wider visibility.

Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com>
[aik: put XICS_IRQS=1024 instead of 64i, fixed endianness and size]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Eduardo Habkost 23825581d7 spapr: Add kvm-type property
The kvm-type machine option was left out when MachineState was
introduced, preventing the kvm-type option from being used. Add the
missing property to the sPAPR machine class, so it can be used.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Eduardo Habkost 748abce94f spapr: Create SPAPRMachine struct
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:46 +02:00
Alexander Graf f7d6914654 PPC: spapr: Expose /hypervisor node in device tree
PR KVM supports an ePAPR compliant hypercall interface in parallel to the
normal sPAPR one. Expose the ePAPR /hypervisor node and properties to the
guest so it can use it.

This enables magic page sharing on PR KVM with -M pseries.

However we had a few nasty bugs in the magic page implementation on vcpus
newer than 970 (p7, p8) that KVM now has workarounds for. It indicates that
it does have these workarounds through the PPC_FIXUP_HCALL capability.

To not expose broken guest kernels to issues on host kernels that don't
have the fixups in place, we don't expose working hypercall instructions
when the fixups are not available so that the guest can never active the
magic page.

Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:41 +02:00
Alexey Kardashevskiy da95324ebe spapr_iommu: Enable multiple TCE requests
Currently only single TCE entry per request is supported (H_PUT_TCE).
However PAPR+ specification allows multiple entry requests such as
H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host
kernel via ioctls, support of these calls can accelerate IOMMU operations.

This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT.

This advertises "multi-tce" capability to the guest if the host kernel
supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:39 +02:00
Alexey Kardashevskiy a1d59c0ffa spapr: Enable dynamic change of the supported hypercalls list
At the moment the "ibm,hypertas-functions" list is fixed. However some
calls should be listed there if they are supported by QEMU or the host
kernel.

This enables hyperrtas_prop to grow on stack by adding
a SPAPR_HYPERRTAS_ADD macro. "qemu,hypertas-functions" is converted as well.

The first user of this is going to be a "multi-tce" property.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-06-16 13:24:38 +02:00