Catch writes to the MSI MMIO region in the KVM APIC and forward them to
the kernel. Provide the kernel support GSI routing, this allows to
enable MSI support also for in-kernel irqchip mode.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Push msi_supported enabling to the APIC implementations where we can
encapsulate the decision more cleanly, hiding the details from the
generic code.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
This patch basically adds kvm_irqchip_send_msi, a service for sending
arbitrary MSI messages to KVM's in-kernel irqchip models.
As the original KVM API requires us to establish a static route from a
pseudo GSI to the target MSI message and inject the MSI via toggling
that virtual IRQ, we need to play some tricks to make this interface
transparent. We create those routes on demand and keep them in a hash
table. Succeeding messages can then search for an existing route in the
table first and reuse it whenever possible. If we should run out of
limited GSIs, we simply flush the table and rebuild it as messages are
sent.
This approach is rather simple and could be optimized further. However,
latest kernels contains a more efficient MSI injection interface that
will obsolete the GSI-based dynamic injection.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Will be used for generating and distributing MSI messages, both in
emulation mode and under KVM.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Instead of the bitmap size, store the maximum of GSIs the kernel
support. Move the GSI limit assertion to the API function
kvm_irqchip_add_route and make it stricter.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If the kernel page size is larger than TARGET_PAGE_SIZE, which
happens for example on ppc64 with kernels compiled for 64K pages,
the dirty tracking doesn't work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Avi Kivity <avi@redhat.com>
* kwolf/for-anthony:
ATA: Allow WIN_SECURITY_FREEZE_LOCK as nop
rbd: add discard support
qcow2: fix the return value -ENOENT -> -EEXIST
qcow2: Don't hold cache references across yield
qcow2: Remove unused parameter in do_alloc_cluster_offset
qemu-iotests: Many parallel allocating I/O requests
docs: fix one issue in qcow2 specs
block/qcow2: Add missing GCC_FMT_ATTR to function report_unsupported()
qemu-iotests: ignore fragmentation information for qed
When using Windows 8 with an AHCI disk drive, it issues a blue screen.
The reason is that WIN_SECURITY_FREEZE_LOCK / CFA_WEAR_LEVEL is not
supported by our ATA implementation, but Windows expects it to be there.
Since without security stuff implemented, the lock would be a nop anyway
and CFA_WEAR_LEVEL already is treated as a nop, let's just allow the cmd
for HD drives as well. That way Windows is happy.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Change the write flag to an operation type in RBDAIOCB, and make the
buffer optional since discard doesn't use it.
Discard is first included in librbd 0.1.2 (which is in Ceph 0.46).
If librbd is too old, leave out qemu_rbd_aio_discard entirely,
so the old behavior is preserved.
Signed-off-by: Josh Durgin <josh.durgin@dreamhost.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Zhi Yong Wu <wuzhy@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If cache references are held while the coroutine has yielded, the cache
may get used up and abort() when it can't find a free entry.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This test case manages to let qcow2 abort because its cache is used up
and it can't find free cache entries for new requests any more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We added image fragmentation statistics functions to qemu-img several days
ago, those patches will cause "./check -qed" failed. This patch will ignore
fragmentation statistics information of qed format, and then "./check -qed"
will work.
Signed-off-by: Dong Xu Wang <wdongxu@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* agraf/s390-for-upstream:
s390: reset avail and used index on reboot
S390: dont call system_shutdown on disabled wait
S390: remove default cdrom, sd-card and floppy support
S390: support reboot for kvm on s390
S390: reboot: reset device pages on reboot
S390: fix error handling on kernel and initrd failures
S390: fix kernel_commandline handling
* stefanha/trivial-patches:
iohandler: Use bool for boolean struct member and remove holes
async: Use bool for boolean struct members and remove a hole
configure: Fix creation of symbolic links for MinGW toolchain
* agraf/ppc-for-upstream:
linux-user: Fix invalid TARGET_ABI_BITS usage on ppc hosts
target-ppc: Some support for dumping TLB_EMB TLBs
ppce500_spin: Replace assert by hw_error (fixes compiler warning)
pseries: Fix use of global CPU state
pseries: Use the same interrupt swizzling for host bridges as p2p bridges
pseries: Implement automatic PAPR VIO address allocation
PPC: Fix up e500 cache size setting
booke:Use MMU API for creating initial mapping for secondary cpus
* mdroth/qga-pull-4-27-12:
qemu-ga: persist tracking of fsfreeze state via filesystem
qemu-ga: add a whitelist for fsfreeze-safe commands
qemu-ga: improve recovery options for fsfreeze
The smb.conf generated by the userspace networking does not include a state directory
directive. Samba therefore falls back to the default value. Since the user generally
does not have write access to this path, smbd immediately crashes.
The "state directory" option was added in Samba 3.4.0 (commit
http://gitweb.samba.org/?p=samba.git;a=commit;h=7b02e05eb64f3ffd7aa1cf027d10a7343c0da757).
This patch adds the missing option.
Signed-off-by: Nikolaus Rath <Nikolaus@rath.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
The "smb ports = 0" option causes recent samba versions to crash. It was
introduced in commit 157777ef3e with log message "Samba 3 support".
However, a value of 0 has never been officially supported by smb and is
also not necessary: if stdin is a socket, smb does not try to listen on
any ports and uses just stdin. This is necessary to support inetd based
operation (otherwise smbd would always fail when called from inetd,
because inetd already listens on the SMB port). Since samba has
supported inetd operation since pre-3.x, it should be safe to rely on
this feature. I have tested it with Samba 3.6.4 -- communication works
fine, and smbd is not listening on any ports.
I suspect the "smb ports = 0" hack may have been introduced when someone
tested the qemu generated samba config from the command line with "smbd
-i" and found it to fail (because then stdin isn't a socket).
Signed-off-by: Nikolaus Rath <Nikolaus@rath.org>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
When trying to evaluate the size of the _host_ type size for olddev_t,
we need to expose the host's pointer size, not the guest pointer size.
This usage got introduced accidently in commit b754e4fc1.
Fix things by not using TARGET_.*, but rather use host sizeof()
information, which gives us the correct size.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
The default case in function spin_read should never be reached,
therefore the old code used assert(0) to abort QEMU.
This does not work when QEMU is compiled with macro NDEBUG defined.
In this case (and also when the compiler does not know that assert
never returns), there is a compiler warning because of the missing
return value.
Using hw_error allows an improved error message and aborts always.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
[agraf: use __func__]
Signed-off-by: Alexander Graf <agraf@suse.de>
Commit ed120055c7 (Implement PAPR VPA
functions for pSeries shared processor partitions) introduced the
deregister_dtl() function and typo "emv" as name of its argument.
This went unnoticed because the code in that function can access the
global variable "env" so that no build failure resulted.
Fix the argument to read "env". Resolves LP#986241.
Signed-off-by: Peter Portante <peter.portante@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
[agraf: fixed typo in commit message]
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the pseries PCI code uses a somewhat strange scheme of PCI irq
allocation - one per slot up to a maximum that's greater than the usual 4.
This scheme more or less worked, because we were able to tell the guest the
irq mapping in the device tree, however it's a bit odd and may break
assumptions in the future. Worse, the array used to construct the dev
tree interrupt map was mis-sized, we got away with it only because it
happened that our SPAPR_PCI_NUM_LSI value was greater than 7.
This patch changes the pseries PCI code to use the same interrupt swizzling
scheme as is standardized for PCI to PCI bridges. This makes for better
consistency, deals better with any devices which use multiple interrupt
pins and will make life easier in the future when we add passthrough of
what may be either a host bridge or a PCI to PCI bridge. This won't break
existing guests, because they don't assume a particular mapping scheme for
host bridges, but just follow what we tell them in the device tree (also
updated to match, of course). This patch also fixes the allocation of the
irq map.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR virtual IO (VIO) devices require a unique, but otherwise arbitrary,
"address" used as a token to the hypercalls which manipulate them.
Currently the pseries machine code does an ok job of allocating these
addresses when the legacy -net nic / -serial and so forth options are used
but will fail to allocate them properly when using -device.
Specifically, you can use -device if all addresses are explicitly assigned.
Without explicit assignment, only one VIO device of each type (network,
console, SCSI) will be assigned properly, any further ones will attempt
to take the same address leading to a fatal error.
This patch fixes the situation by adding a proper address allocator to the
VIO "bus" code. This is used both by -device and the legacy options and
default devices. Addresses can still be explicitly assigned with -device
options if desired.
This patch changes the (guest visible) numbering of VIO devices, but since
their addresses are discovered using the device tree and already differ
from the numbering found on existing PowerVM systems, this does not break
compatibility.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When initializing the e500 code, we need to expose its
cache line size for user and system mode, while the mmu
details are only interesting for system emulation.
Split the 2 switch statements apart, allowing us to #ifdef
out the mmu parts for user mode emulation while keeping all
cache information consistent.
Signed-off-by: Alexander Graf <agraf@suse.de>
Initial Mapping creation for secondary CPU in SMP was missing new MMU API.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
On my PPC host, HOST_LONG_SIZE is not defined even after
running configure. Use the normal C way of determining the
long size instead.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: malc <av1474@comtv.ru>
The tracetool code requires Python 2.4, which was released in 2004.
Check for a supported Python version so we can give a clear error
message.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
The pkgutil.iter_modules() function provides a way to enumerate child
modules. Unfortunately it's missing in Python <2.7 so we must implement
similar behavior ourselves.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
The str.rpartition() function is related to str.split() and is used for
splitting strings. It was introduced in Python 2.5 and therefore cannot
be used in tracetool as Python 2.4 compatibility is required.
Replace the code using str.rsplit().
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
In Python 2.5 keyword arguments were added to __import__(). Avoid using
them to achieve Python 2.4 compatibility.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
The newer "except <exception-type> as <exception>:" syntax is not
supported by Python 2.4, we need to use "except <exception-type>,
<exception>:".
Tested all trace backends with Python 2.4.
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
reset the guest vring avail/used idx fields, otherwise it's possible
that old values remain in memory which would cause a reboot to fail
with a "Guest moved used index" message
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
A disabled wait usually indicates a guest problem. Dont shutdown the
guest to allow guest dumping.
Have some special cases, e.g. a quiesce disabled wait. In that case
we want to shutdown.
Long term solution might be a crashed/panic indication.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch simply disables CDROM, SD card and floppy support for the
s390 virtio machine. Without this patch, a default CDROM drive would
get added which has currently no backing on s390.
Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch adds reboot support for s390x-softmmu by calling
the generic reboot support in kvm.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch fixes reboot on s390 by resetting the device
page on reboot.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
If the user specifies a non-existing or non-accessable kernel or initrd
qemu does not fail, instead it ipls into the system, which then falls
into a program check loop due to the zeroed memory with no kernel.
Lets add some sanity checks.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>