qemu-e2k

Commit Graph

Author	SHA1	Message	Date
Benjamin Herrenschmidt	cd0c6f4735	ppc: Do some batching of TCG tlb flushes On ppc64 especially, we flush the tlb on any slbie or tlbie instruction. However, those instructions often come in bursts of 3 or more (context switch will favor a series of slbie's for example to an slbia if the SLB has less than a certain number of entries in it, and tlbie's can happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries at a time). Doing a tlb_flush() each time is a waste of time. We end up doing a memset of the whole TLB, reloading it for the next instruction, memset'ing again, etc... Those instructions don't have to take effect immediately. For slbie, they can wait for the next context synchronizing event. For tlbie, the next tlbsync. This implements batching by keeping a flag that indicates that we have a TLB in need of flushing. We check it on interrupts, rfi's, isync's and tlbsync and flush the TLB if needed. This reduces the number of tlb_flush() on a boot to an Ubuntu installer first dialog screen from roughly 360K down to 36K. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: added a 'CPUPPCState *' variable in h_remove() and h_bulk_remove() ] Signed-off-by: Cédric Le Goater <clg@kaod.org> [dwg: removed spurious whitespace change, use 0/1 not true/false consistently, since tlb_need_flush has int type] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-05-30 13:20:04 +10:00
Alexey Kardashevskiy	fec5d3a1cd	spapr_iommu: Move table allocation to helpers At the moment the presence of vfio-pci devices on a bus affects the way the guest view table is allocated. If there is no vfio-pci on a PHB and the host kernel supports KVM acceleration of H_PUT_TCE, a table is allocated in KVM. However, if there is vfio-pci and we do not yet have KVM acceleration for these, the table has to be allocated by the userspace. At the moment the table is allocated once at boot time but next patches will reallocate it. This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts to helpers. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-05-27 09:40:23 +10:00
Alexey Kardashevskiy	eded5bac3b	spapr_pci: Use correct DMA LIOBN when composing the device tree The user could have picked LIOBN via the CLI but the device tree rendering code would still use the value derived from the PHB index (which is the default fallback if LIOBN is not set in the CLI). This replaces SPAPR_PCI_LIOBN() with the actual DMA LIOBN value. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-05-27 09:40:23 +10:00
Jianjun Duan	5dd5238c0b	spapr: ensure device trees are always associated with DRC There are possible racing situations involving hotplug events and guest migration. For cases where a hotplug event is migrated, or the guest is in the process of fetching device tree at the time of migration, we need to ensure the device tree is created and associated with the corresponding DRC for devices that were hotplugged on the source, but 'coldplugged' on the target. Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-05-27 09:40:23 +10:00
Zhou Jie	8afc22a20f	Added negative check for get_image_size() This patch adds a check for a negative return value from get_image_size() where it is missing. It avoids two unnecessary function calls. Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-05-27 09:40:23 +10:00
Alexey Kardashevskiy	d78c19b5cf	memory: Fix IOMMU replay base address Since `a788f227` "memory: Allow replay of IOMMU mapping notifications" when new VFIO listener is added, all existing IOMMU mappings are replayed. However there is a problem that the base address of an IOMMU memory region (IOMMU MR) is ignored which is not a problem for the existing user (which is pseries) with its default 32bit DMA window starting at 0 but it is if there is another DMA window. This stores the IOMMU's offset_within_address_space and adjusts the IOVA before calling vfio_dma_map/vfio_dma_unmap. As the IOMMU notifier expects IOVA offset rather than the absolute address, this also adjusts IOVA in sPAPR H_PUT_TCE handler before calling notifier(s). Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2016-05-26 11:12:08 -06:00
Igor Mammedov	bacc344c54	machine: add properties to compat_props incrementaly Switch to adding compat properties incrementally instead of completely overwriting compat_props per machine type. That removes data duplication which we have due to nested [PC\|SPAPR]_COMPAT_* macros. It also allows setting default device properties from the default foo_machine_options() hook, which will be used in a following patch for putting the VMGENID device as a function of the ISA bridge on pc/q35 machines. Suggested-by: Eduardo Habkost <ehabkost@redhat.com> Signed-off-by: Igor Mammedov <imammedo@redhat.com> Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> Reviewed-by: Eduardo Habkost <ehabkost@redhat.com> [ehabkost: Fixed CCW_COMPAT_* and PC_COMPAT_0_* defines] Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-05-20 14:28:54 -03:00
Paolo Bonzini	63c915526d	cpu: move exec-all.h inclusion out of cpu.h exec-all.h contains TCG-specific definitions. It is not needed outside TCG-specific files such as translate.c, exec.c or *helper.c. One generic function had snuck into include/exec/exec-all.h; move it to include/qom/cpu.h. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Paolo Bonzini	03dd024ff5	hw: explicitly include qemu/log.h Move the inclusion out of hw/hw.h, most files do not need it. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Paolo Bonzini	33c11879fd	qemu-common: push cpu.h inclusion out of qemu-common.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:29 +02:00
Paolo Bonzini	77ac58ddc6	dma: do not depend on kvm_enabled() Memory barriers are needed also by Xen and, when the ioeventfd bugs are fixed, by TCG as well. sysemu/kvm.h is not anymore needed in sysemu/dma.h, move it to the actual users. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:28 +02:00
Paolo Bonzini	cbd62f8616	hw: do not use VMSTATE_*TL Reserve this to CPU state serialization. Luckily, they were only used by sPAPR devices and these are ppc64 only. So there is no change to migration format. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:28 +02:00
Paolo Bonzini	aa5a9e2484	ppc: use PowerPCCPU instead of CPUPPCState This changes a cpu.h dependency for hw/ppc/ppc.h into a cpu-qom.h dependency. For it to compile we also need to clean up a few unused definitions. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-05-19 16:42:27 +02:00
Eric Blake	d9f62dde13	qapi: Simplify semantics of visit_next_list() The semantics of the list visit are somewhat baroque, with the following pseudocode when FooList is used: start() for (prev = head; cur = next(prev); prev = &cur) { visit(&cur->value) } Note that these semantics (advance before visit) require that the first call to next() return the list head, while all other calls return the next element of the list; that is, every visitor implementation is required to track extra state to decide whether to return the input as-is, or to advance. It also requires an argument of 'GenericList **' to next(), solely because the first iteration might need to modify the caller's GenericList head, so that all other calls have to do a layer of dereferencing. Thankfully, we only have two uses of list visits in the entire code base: one in spapr_drc (which completely avoids visit_next_list(), feeding in integers from a different source than uint8List), and one in qapi-visit.py. That is, all other list visitors are generated in qapi-visit.c, and share the same paradigm based on a qapi FooList type, so we can refactor how lists are laid out with minimal churn among clients. We can greatly simplify things by hoisting the special case into the start() routine, and flipping the order in the loop to visit before advance: start(head) for (tail = head; tail; tail = next(tail)) { visit(&tail->value) } With the simpler semantics, visitors have less state to track, the argument to next() is reduced to 'GenericList *', and it also becomes obvious whether an input visitor is allocating a FooList during visit_start_list() (rather than the old way of not knowing if an allocation happened until the first visit_next_list()). As a minor drawback, we now allocate in two functions instead of one, and have to pass the size to both functions (unless we were to tweak the input visitors to cache the size to start_list for reuse during next_list, but that defeats the goal of less visitor state). The signature of visit_start_list() is chosen to match visit_start_struct(), with the new parameters after 'name'. The spapr_drc case is a virtual visit, done by passing NULL for list, similarly to how NULL is passed to visit_start_struct() when a qapi type is not used in those visits. It was easy to provide these semantics for qmp-output and dealloc visitors, and a bit harder for qmp-input (several prerequisite patches refactored things to make this patch straightforward). But it turned out that the string and opts visitors munge enough other state during visit_next_list() to make it easier to just document and require a GenericList visit for now; an assertion will remind us to adjust things if we need the semantics in the future. Several pre-requisite cleanup patches made the reshuffling of the various visitors easier; particularly the qmp input visitor. Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1461879932-9020-24-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-05-12 09:47:55 +02:00
Eric Blake	15c2f669e3	qapi: Split visit_end_struct() into pieces As mentioned in previous patches, we want to call visit_end_struct() functions unconditionally, so that visitors can release resources tied up since the matching visit_start_struct() without also having to worry about error priority if more than one error occurs. Even though error_propagate() can be safely used to ignore a second error during cleanup caused by a first error, it is simpler if the cleanup cannot set an error. So, split out the error checking portion (basically, input visitors checking for unvisited keys) into a new function visit_check_struct(), which can be safely skipped if any earlier errors are encountered, and leave the cleanup portion (which never fails, but must be called unconditionally if visit_start_struct() succeeded) in visit_end_struct(). Generated code in qapi-visit.c has diffs resembling: \|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v, \| goto out_obj; \| } \| visit_type_ACPIOSTInfo_members(v, obj, &err); \|- error_propagate(errp, err); \|- err = NULL; \|+ if (err) { \|+ goto out_obj; \|+ } \|+ visit_check_struct(v, &err); \| out_obj: \|- visit_end_struct(v, &err); \|+ visit_end_struct(v); \| out: and in qapi-event.c: @@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP \| goto out; \| } \| visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, &param, &err); \|- visit_end_struct(v, err ? NULL : &err); \|+ if (!err) { \|+ visit_check_struct(v, &err); \|+ } \|+ visit_end_struct(v); \| if (err) { \| goto out; Signed-off-by: Eric Blake <eblake@redhat.com> Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com> [Conflict with a doc fixup resolved] Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-05-12 09:47:55 +02:00
Eric Blake	a543a554cf	spapr_drc: Expose 'null' in qom-get when there is no fdt Now that the QMP output visitor supports an explicit null output, we should utilize it to make it easier to diagnose the difference between a missing fdt ('null') vs. a present-but-empty one ('{}'). (Note that this reverts the behavior of commit `ab8bf1d`, taking us back to the behavior of commit `6c2f9a1` [which in turn stemmed from a crash fix in `1d10b44`]; but that this time, the change is intentional and not an accidental side-effect.) Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Message-Id: <1461879932-9020-17-git-send-email-eblake@redhat.com> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2016-05-12 09:47:54 +02:00
Michael Roth	df18b2db69	spapr_drc: fix aborts during DRC-count based hotplug CPU/memory resources can be signalled en-masse via spapr_hotplug_req_add_by_count(), and when doing so, actually change the meaning of the 'drc' parameter passed to spapr_hotplug_req_event() to be a count rather than an index. `f40eb92` added a hook in spapr_hotplug_req_event() to record when a device had been 'signalled' to the guest, but that code assumes that drc is always an index. In cases where it's a count, such as memory hotplug, the DRC lookup will fail, leading to an assert. Fix this by only explicitly setting the signalled state for cases where we are doing PCI hotplug. For other resources types, since we cannot selectively track whether a resource has been signalled in cases where we signal attach as a count, set the 'signalled' state to true immediately upon making the resource available via drck->attach(). Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: david@gibson.dropbear.id.au Cc: qemu-ppc@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-04-26 11:16:08 +10:00
Thomas Huth	da34fed707	hw/ppc/spapr: Fix crash when specifying bad parameters to spapr-pci-host-bridge QEMU currently crashes when using bad parameters for the spapr-pci-host-bridge device: $ qemu-system-ppc64 -device spapr-pci-host-bridge,buid=0x123,liobn=0x321,mem_win_addr=0x1,io_win_addr=0x10 Segmentation fault The problem is that spapr_tce_find_by_liobn() might return NULL, but the code in spapr_populate_pci_dt() does not check for this condition and then tries to dereference this NULL pointer. Apart from that, the return value of spapr_populate_pci_dt() also has to be checked for all PCI buses, not only for the last one, to make sure we catch all errors. Signed-off-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-04-23 16:52:20 +10:00
Peter Maydell	3be4f4d724	ppc patch queue for 2016-04-08 Just a single bugfix for spapr in this batch, but I want to make sure it gets in for 2.6. Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160408' into staging # gpg: Signature made Fri 08 Apr 2016 06:02:45 BST using RSA key ID 20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.6-20160408: spapr: Fix ibm,lrdr-capacity Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-04-08 11:54:19 +01:00
Bharata B Rao	a110655a06	spapr: Fix ibm,lrdr-capacity ibm,lrdr-capacity has a field to describe the maximum address in bytes and therefore, the most memory that can be allocated to this guest. We are using maxmem for this field, but instead should use the actual RAM address corresponding to the end of hotplug region. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-04-08 11:18:10 +10:00
Gonglei	1a5512bb7e	spapr: fix possible Negative array index read fix CID 1351391. Signed-off-by: Gonglei <arei.gonglei@huawei.com> Message-Id: <1456998223-12356-6-git-send-email-arei.gonglei@huawei.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-04-08 00:07:56 +02:00
Michael Roth	f40eb921da	spapr_drc: enable immediate detach for unsignalled devices Currently spapr doesn't support "aborting" hotplug of PCI devices by allowing device_del to immediately remove the device if we haven't signalled the presence of the device to the guest. In the past this wasn't an issue, since we always immediately signalled device attach and simply relied on full guest-aware add->remove path for device removal. However, as of `788d259`, we now defer signalling for PCI functions until function 0 is attached, so now we need to deal with these "abort" operations for cases where a user hotplugs a non-0 function, then opts to remove it prior to hotplugging function 0. Currently they'd have to reboot before the unplug completed. PCIe multifunction hotplug does not have this requirement however, so from a management implementation perspective it would be good to address this within the same release as `788d259`. We accomplish this by simply adding a 'signalled' flag to track whether a device hotplug event has been sent to the guest. If it hasn't, we allow immediate removal under the assumption that the guest will not be using the device. Devices present at boot/reset time are also assumed to be 'signalled'. For CPU/memory/etc, signalling will still happen immediately as part of device_add, so only PCI functions should be affected. Cc: bharata@linux.vnet.ibm.com Cc: david@gibson.dropbear.id.au Cc: sbhat@linux.vnet.ibm.com Cc: qemu-ppc@nongnu.org Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> [dwg: This fixes a regression where an incorrect hot-add of a non-zero function can no longer be backed out until function 0 is added] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-04-05 10:47:03 +10:00
Cédric Le Goater	5c94b2a5e5	ppc: Rework POWER7 & POWER8 exception model From: Benjamin Herrenschmidt <benh@kernel.crashing.org> This patch fixes the current AIL implementation for POWER8. The interrupt vector address can be calculated directly from LPCR when the exception is handled. The excp_prefix update becomes useless and we can cleanup the H_SET_MODE hcall. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [clg: Removed LPES0/1 handling for HV vs. !HV Fixed LPCR_ILE case for POWERPC_EXCP_POWER8 ] Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> [dwg: This was written as a cleanup, but it also fixes a real bug where setting an alternative interrupt location would not be correctly migrated] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-04-05 10:38:24 +10:00
Peter Maydell	84a5a80148	* Log filtering from Alex and Peter * Chardev fix from Marc-André * config.status tweak from David * Header file tweaks from Markus, myself and Veronia (Outreachy candidate) * get_ticks_per_sec() removal from Rutuja (Outreachy candidate) * Coverity fix from myself * PKE implementation from myself, based on rth's XSAVE support Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging # gpg: Signature made Thu 24 Mar 2016 20:15:11 GMT using RSA key ID 78C7AE83 # gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" # gpg: aka "Paolo Bonzini <pbonzini@redhat.com>" * remotes/bonzini/tags/for-upstream: (28 commits) target-i386: implement PKE for TCG config.status: Pass extra parameters char: translate from QIOChannel error to errno exec: fix error handling in file_ram_alloc cputlb: modernise the debug support qemu-log: support simple pid substitution for logs target-arm: dfilter support for in_asm qemu-log: dfilter-ise exec, out_asm, op and opt_op qemu-log: new option -dfilter to limit output qemu-log: Improve the "exec" TB execution logging qemu-log: Avoid function call for disabled qemu_log_mask logging qemu-log: correct help text for -d cpu tcg: pass down TranslationBlock to tcg_code_gen util: move declarations out of qemu-common.h Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND hw: explicitly include qemu-common.h and cpu.h include/crypto: Include qapi-types.h or qemu/bswap.h instead of qemu-common.h isa: Move DMA_transfer_handler from qemu-common.h to hw/isa/isa.h Move ParallelIOArg from qemu-common.h to sysemu/char.h Move QEMU_ALIGN_*() from qemu-common.h to qemu/osdep.h ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Conflicts: scripts/clean-includes	2016-03-24 21:42:40 +00:00
Thomas Huth	57c522f47b	hw/net/spapr_llan: Enable the RX buffer pools by default for new machines RX buffer pools are now enabled by default for new machine types. For older machine types, they are still disabled to avoid breaking migration. Signed-off-by: Thomas Huth <thuth@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-03-24 11:17:34 +11:00
Benjamin Herrenschmidt	26a7f1291b	ppc: Create cpu_ppc_set_papr() helper And move the code adjusting the MSR mask and calling kvmppc_set_papr() to it. This allows us to add a few more things such as disabling setting of MSR:HV and appropriate LPCR bits which will be used when fixing the exception model. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [clg: removed LPCR setting ] Signed-off-by: Cédric Le Goater <clg@fr.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-03-24 11:17:34 +11:00
Alexey Kardashevskiy	0ddbd05362	spapr/target-ppc/kvm: Only add hcall-instructions if KVM supports it ePAPR defines "hcall-instructions" device-tree property which contains code to call hypercalls in ePAPR paravirtualized guests. In general pseries guests won't use this property, instead using the PAPR defined hypercall interface. However, this property has been re-used to implement a hack to allow PR KVM to run (slightly modified) guests in some situations where it otherwise wouldn't be able to (because the system's L0 hypervisor doesn't forward the PAPR hypercalls to the PR KVM kernel). Hence, this property is always present in the device tree for pseries guests. All KVM guests use it at least to read features via the KVM_HC_FEATURES hypercall. The property is populated by the code returned from the KVM's KVM_PPC_GET_PVINFO ioctl; if not implemented in the KVM, QEMU supplies code which will fail all hypercall attempts. If QEMU does not create the property, and the guest kernel is compiled with CONFIG_EPAPR_PARAVIRT (which is normally the case), there is exactly the same stub at @epapr_hypercall_start already. Rather than maintaining this fairly useless stub implementation, it makes more sense not to create the property in the device tree in the first place if the host kernel does not implement it. This changes kvmppc_get_hypercall() to return 1 if the host kernel does not implement KVM_CAP_PPC_GET_PVINFO. The caller can use it to decide on whether to create the property or not. This changes the pseries machine to not create the property if KVM does not implement KVM_PPC_GET_PVINFO. In practice this means that from now on the property will not be created if either HV KVM or TCG is used. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [reworded commit message for clarity --dwg] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-03-24 11:17:33 +11:00
Veronia Bahaa	f348b6d1a5	util: move declarations out of qemu-common.h Move declarations out of qemu-common.h for functions declared in utils/ files: e.g. include/qemu/path.h for utils/path.c. Move inline functions out of qemu-common.h and into new files (e.g. include/qemu/bcd.h) Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:17 +01:00
Rutuja Shah	73bcb24d93	Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND This patch replaces get_ticks_per_sec() calls with the macro NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec() is then removed. This replacement improves the readability and understandability of code. For example, timer_mod(fdctrl->result_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50)); NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns matches the unit of the expression on the right side of the plus. Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:17 +01:00
Paolo Bonzini	4771d756f4	hw: explicitly include qemu-common.h and cpu.h Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:17 +01:00
Markus Armbruster	da34e65cb4	include/qemu/osdep.h: Don't include qapi/error.h Commit `57cb38b` included qapi/error.h into qemu/osdep.h to get the Error typedef. Since then, we've moved to include qemu/osdep.h everywhere. Its file comment explains: "To avoid getting into possible circular include dependencies, this file should not include any other QEMU headers, with the exceptions of config-host.h, compiler.h, os-posix.h and os-win32.h, all of which are doing a similar job to this file and are under similar constraints." qapi/error.h doesn't do a similar job, and it doesn't adhere to similar constraints: it includes qapi-types.h. That's in excess of 100KiB of crap most .c files don't actually need. Add the typedef to qemu/typedefs.h, and include that instead of qapi/error.h. Include qapi/error.h in .c files that need it and don't get it now. Include qapi-types.h in qom/object.h for uint16List. Update scripts/clean-includes accordingly. Update it further to match reality: replace config.h by config-target.h, add sysemu/os-posix.h, sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h comment quoted above similarly. This reduces the number of objects depending on qapi/error.h from "all of them" to less than a third. Unfortunately, the number depending on qapi-types.h shrinks only a little. More work is needed for that one. Signed-off-by: Markus Armbruster <armbru@redhat.com> [Fix compilation without the spice devel packages. - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2016-03-22 22:20:15 +01:00
Eduardo Habkost	0e6aac87fd	machine: Use type_init() to register machine classes Change all machine_init() users that simply call type_register*() to use type_init(). Cc: Evgeny Voevodin <e.voevodin@samsung.com> Cc: Maksim Kozlov <m.kozlov@samsung.com> Cc: Igor Mitsyanko <i.mitsyanko@gmail.com> Cc: Dmitry Solodkiy <d.solodkiy@samsung.com> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Rob Herring <robh@kernel.org> Cc: Andrzej Zaborowski <balrogg@gmail.com> Cc: Michael Walle <michael@walle.cc> Cc: "Hervé Poussineau" <hpoussin@reactos.org> Cc: Aurelien Jarno <aurelien@aurel32.net> Cc: Leon Alrae <leon.alrae@imgtec.com> Cc: Alexander Graf <agraf@suse.de> Cc: David Gibson <david@gibson.dropbear.id.au> Cc: Blue Swirl <blauwirbel@gmail.com> Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: "Michael S. Tsirkin" <mst@redhat.com> Acked-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>	2016-03-16 15:34:05 -03:00
David Gibson	a36304fdca	spapr_pci: Remove finish_realize hook Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is only one implementation of the finish_realize hook in sPAPRPHBClass. So, we can fold that implementation into its (single) caller, and remove the hook. That's the last thing left in sPAPRPHBClass, so that can go away as well. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-03-16 09:55:11 +11:00
David Gibson	72700d7e73	spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge Now that the regular spapr-pci-host-bridge can handle EEH, there are only two things that spapr-pci-vfio-host-bridge does differently: 1. automatically sizes its DMA window to match the host IOMMU 2. checks if the attached VFIO container is backed by the VFIO_SPAPR_TCE_IOMMU type on the host (1) is not particularly useful, since the default window used by the regular host bridge will work with the host IOMMU configuration on all current systems anyway. Plus, automatically changing guest visible configuration (such as the DMA window) based on host settings is generally a bad idea. It's not definitively broken, since spapr-pci-vfio-host-bridge is only supposed to support VFIO devices which can't be migrated anyway, but still. (2) is not really useful, because if a guest tries to configure EEH on a different host IOMMU, the first call will fail and that will be that. It's possible there are scripts or tools out there which expect spapr-pci-vfio-host-bridge, so we don't remove it entirely. This patch reduces it to just a stub for backwards compatibility. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-03-16 09:55:11 +11:00
David Gibson	c1fa017c7e	spapr_pci: Allow EEH on spapr-pci-host-bridge Now that the EEH code is independent of the special spapr-pci-vfio-host-bridge device, we can allow it on all spapr PCI host bridges instead. We do this by changing spapr_phb_eeh_available() to be based on the vfio_eeh_as_ok() call instead of the host bridge class. Because the value of vfio_eeh_as_ok() can change with devices being hotplugged or unplugged, this can potentially lead to some strange edge cases where the guest starts using EEH, then it starts failing because of a change in status. However, it's not really any worse than the current situation. Cases that would have worked previously will still work (i.e. VFIO devices from at most one VFIO IOMMU group per vPHB), it's just that it's no longer necessary to use spapr-pci-vfio-host-bridge with the groupid pre-specified. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-03-16 09:55:11 +11:00
David Gibson	fbb4e98341	spapr_pci: Eliminate class callbacks The EEH operations in the spapr-pci-vfio-host-bridge no longer rely on the special groupid field in sPAPRPHBVFIOState. So we can simplify, replacing the class specific callbacks with direct calls based on a simple spapr_phb_eeh_enabled() helper. For now we implement that in terms of a boolean in the class, but we'll continue to clean that up later. On its own this is a rather strange way of doing things, but it's a useful intermediate step to further cleanups. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-03-16 09:55:10 +11:00
David Gibson	76a9e9f680	spapr_pci: Switch to vfio_eeh_as_op() interface This switches all EEH on VFIO operations in spapr_pci_vfio.c from the broken vfio_container_ioctl() interface to the new vfio_eeh_as_op() interface. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>	2016-03-16 09:55:10 +11:00
Greg Kurz	f1a6cf3ef7	spapr_rng: fix race with main loop Since commit "60253ed1e6ec rng: add request queue support to rng-random", the use of a spapr_rng device may hang vCPU threads. The following path is taken without holding the lock to the main loop mutex: h_random() rng_backend_request_entropy() rng_random_request_entropy() qemu_set_fd_handler() The consequence is that entropy_available() may be called before the vCPU thread could even queue the request: depending on the scheduling, it may happen that entropy_available() does not call random_recv()->qemu_sem_post(). The vCPU thread will then sleep forever in h_random()->qemu_sem_wait(). This could not happen before `60253ed1e6` because entropy_available() used to call random_recv() unconditionally. This patch ensures the lock is held to avoid the race. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Cédric Le Goater <clg@fr.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-03-16 09:55:06 +11:00
David Gibson	c18ad9a54b	target-ppc: Eliminate kvmppc_kern_htab global `fa48b43` "target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM" purports to remove a hack in the handling of hash page tables (HPTs) managed by KVM instead of qemu. However, it actually went in the wrong direction. That patch requires anything looking for an external HPT (that is one not managed by the guest itself) to check both env->external_htab (for a qemu managed HPT) and kvmppc_kern_htab (for a KVM managed HPT). That's a problem because kvmppc_kern_htab is local to mmu-hash64.c, but some places which need to check for an external HPT are outside that, such as kvm_arch_get_registers(). The latter was subtly broken by the earlier patch such that gdbstub can no longer access memory. Basically a KVM managed HPT is much more like a qemu managed HPT than it is like a guest managed HPT, so the original "hack" was actually on the right track. This partially reverts `fa48b43`, so we again mark a KVM managed external HPT by putting a special but non-NULL value in env->external_htab. It then goes further, using that marker to eliminate the kvmppc_kern_htab global entirely. The ppc_hash64_set_external_hpt() helper function is extended to set that marker if passed a NULL value (if you're setting an external HPT, but don't have an actual HPT to set, the assumption is that it must be a KVM managed HPT). This also has some flow-on changes to the HPT access helpers, required by the above changes. Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Thomas Huth <thuth@redhat.com> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>	2016-03-16 09:55:06 +11:00
David Gibson	e5c0d3ce40	target-ppc: Add helpers for updating a CPU's SDR1 and external HPT When a Power cpu with 64-bit hash MMU has its hash page table (HPT) pointer updated by a write to the SDR1 register we need to update some derived variables. Likewise, when the cpu is configured for an external HPT (one not in the guest memory space) some derived variables need to be updated. Currently the logic for this is (partially) duplicated in ppc_store_sdr1() and in spapr_cpu_reset(). In future we're going to need it in some other places, so make some common helpers for this update. In addition the new ppc_hash64_set_external_hpt() helper also updates SDR1 in KVM - it's not updated by the normal runtime KVM <-> qemu CPU synchronization. In a sense this belongs logically in the ppc_hash64_set_sdr1() helper, but that is called from kvm_arch_get_registers() so can't itself call cpu_synchronize_state() without infinite recursion. In practice this doesn't matter because the only other caller is TCG specific. Currently there aren't situations where updating SDR1 at runtime in KVM matters, but there are going to be in future. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com>	2016-03-16 09:55:06 +11:00
Michael Roth	788d2599de	spapr_pci: fix multifunction hotplug Since `3f1e147`, QEMU has adopted a convention of supporting function hotplug by deferring hotplug events until func 0 is hotplugged. This is likely how management tools like libvirt would expose such support going forward. Since sPAPR guests rely on per-func events rather than slot-based, our protocol has been to hotplug func 0 first, to avoid the undefined behavior that results when devices appear within guests without func 0 present. To remain compatible with the new convention, defer hotplug in a similar manner, but then generate events in 0-first order as we did in the past. Once func 0 is present, fail any attempts to plug additional functions (as we do with PCIe). For unplug, defer unplug operations in a similar manner, but generate unplug events such that function 0 is removed last in the guest. Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-03-16 09:55:05 +11:00
Michael S. Tsirkin	226419d615	msi_supported -> msi_nonbroken Rename controller flag to make it clearer what it means. Add some documentation as well. Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2016-03-11 16:45:21 +02:00
Peter Crosthwaite	7ef295ea5b	loader: Add data swap option to load-elf Some CPUs are of an opposite data-endianness to other components in the system. Sometimes elfs have the data sections laid out with this CPU data-endianness accounting for when loaded via the CPU, so byte swaps (relative to other system components) will occur. The leading example is ARM's BE32 mode, which is basically LE with address manipulation on half-word and byte accesses to access the hw/byte reversed address. This means that word data is invariant across LE and BE32. This also means that instructions are still LE. The expectation is that the elf will be loaded via the CPU in this endianness scheme, which means the data in the elf is reversed at compile time. As QEMU loads via the system memory directly, rather than the CPU, we need a mechanism to reverse elf data endianness to implement this possibility. Reviewed-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-03-04 11:30:21 +00:00
Greg Kurz	a005b3ef50	xics: report errors with the QEMU Error API Using the return value to report errors is error prone: - xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors on 0 - xics_alloc_block() returns the unclear value of ics->offset - 1 on error but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0 This patch adds an errp argument to xics_alloc() and xics_alloc_block() to report errors. The return value of these functions is a valid IRQ number if errp is NULL. It is undefined otherwise. The corresponding error traces get promoted to error messages. Note that the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also moves to xics_alloc(). Similar error message consolidation isn't really applicable to xics_alloc_block() because callers have extra context (device config address, MSI or MSIX). This fixes the issues mentioned above. Based on previous work from Brian W. Hart. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-28 16:19:02 +11:00
Greg Kurz	09b5e30da5	spapr: skip configuration section during migration of older machines Since QEMU 2.4, we have a configuration section in the migration stream. This must be skipped for older machines, like it is already done for x86. This patch fixes the migration of pseries-2.3 from/to QEMU 2.3, but it breaks migration of the same machine from/to QEMU 2.4/2.4.1/2.5. We do that anyway because QEMU 2.3 is likely to be more widely deployed than newer QEMU versions. Fixes: `61964c23e5` Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Laurent Vivier <lvivier@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-28 16:19:02 +11:00
Greg Kurz	cba0e7796b	spapr: disable vmdesc submission for old machines Since QEMU 2.3, we have a vmdesc section in the migration stream. This section is not mandatory but when migrating a pseries-2.2 machine from QEMU 2.2, you get a warning at the destination: qemu-system-ppc64: Expected vmdescription section, but got 0 The warning goes away if we decide to skip vmdesc as well for older pseries, like it is already done for pc's. This can only be observed with -cpu POWER7 because POWER8 cannot migrate from QEMU 2.2 to 2.3 (insns_flags2 mismatch). Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-28 16:19:02 +11:00
Greg Kurz	ce266b75fe	spapr_pci: fix irq leak in RTAS ibm,change-msi This RTAS call is used to request new interrupts or to free all interrupts. If the driver has already allocated interrupts and asks again for a non-null number of irqs, then the rtas_ibm_change_msi() function will silently leak the previous interrupts. It happens because xics_free() is only called when the driver releases all interrupts (!req_num case). Note that the previously allocated spapr_pci_msi is not leaked because the GHashTable is created with destroy functions and g_hash_table_insert() hence frees the old value. This patch makes sure any previously allocated MSIs are released when a new allocation succeeds. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-28 16:19:02 +11:00
Greg Kurz	d4a63ac8b1	spapr_pci: kill useless variable in rtas_ibm_change_msi() The num local variable is initialized to zero and has no writer. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-28 16:19:02 +11:00
Greg Kurz	3d0db3e74d	spapr_rng: disable hotpluggability It is currently possible to hotplug a spapr_rng device but QEMU crashes when we try to hot unplug: ERROR:hw/core/qdev.c:295:qdev_unplug: assertion failed: (hotplug_ctrl) Aborted This happens because spapr_rng isn't plugged to any bus and sPAPR does not provide hotplug support for it: qdev_get_hotplug_handler() hence returns NULL and we hit the assertion. And anyway, it doesn't make much sense to unplug this device since hcalls cannot be unregistered. Even the idea of hotplugging a RNG device instead of declaring it on the QEMU command line looks weird. This patch simply disables hotpluggability for the spapr-rng class. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-28 16:19:02 +11:00
Greg Kurz	9897e46264	spapr: initialize local Error pointer This fixes a crash in the target QEMU during migration. Broken in commit `c5f54f3`. Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com> Reviewed-by: Markus Armbruster <armbru@redhat.com> [reworded commit message] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-02-25 13:58:44 +11:00

698 Commits