The ioreq-server API added to Xen 4.5 offers better security than
the existing Xen/QEMU interface because the shared pages that are
used to pass emulation request/results back and forth are removed
from the guest's memory space before any requests are serviced.
This prevents the guest from mapping these pages (they are in a
well known location) and attempting to attack QEMU by synthesizing
its own request structures. Hence, this patch modifies configure
to detect whether the API is available, and adds the necessary
code to use the API if it is.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
A new common module is created. It implements all functions
that have no device specificity (PCI, Platform).
This patch only consists in move (no functional changes)
Signed-off-by: Kim Phillips <kim.phillips@linaro.org>
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
vfio_get_device now takes a VFIODevice as argument. The function is split
into 2 parts: vfio_get_device which is generic and vfio_populate_device
which is bus specific.
3 new fields are introduced in VFIODevice to store dev_info.
vfio_put_base_device is created.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This structure is going to be shared by VFIOPCIDevice and
VFIOPlatformDevice. VFIOBAR includes it.
vfio_eoi becomes an ops of VFIODevice specialized by parent device.
This makes possible to transform vfio_bar_write/read into generic
vfio_region_write/read that will be used by VFIOPlatformDevice too.
vfio_mmap_bar becomes vfio_map_region
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This patch removes all DPRINTF and replace them by trace points.
A few DPRINTF used in error cases were transformed into error_report.
Signed-off-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
MSI-X works slightly different than INTx; the doorbell
registers are not necessarily used as MSI-X interrupts
are directed anyway. So the head pointer on the
reply queue needs to be updated as soon as a frame
is completed, and we can set the doorbell only
when in INTx mode.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Windows requires the frames to be unmapped, otherwise we run
into a race condition where the updated frame data is not
visible to the guest.
With that we can simplify the queue algorithm and use a bitmap
for tracking free frames.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Improve queue logging by displaying head and tail pointer
of the completion queue.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The windows driver is sending several init_firmware commands
when in MSI-X mode. It is, however, using only the first
queue. So disregard any additional init_firmware commands
until the HBA is reset.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The EFI firmware doesn't handle unit attentions properly,
so we need to clear the Power On/Reset unit attention upon
initial reset.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
To ease debugging we should be decoding
the register names.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The trace events already contain the function name, so the actual
message doesn't need to contain any of these informations.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device models should access their block backends only through the
block-backend.h API. Convert them, and drop direct includes of
inappropriate headers.
Just four uses of BlockDriverState are left:
* The Xen paravirtual block device backend (xen_disk.c) opens images
itself when set up via xenbus, bypassing blockdev.c. I figure it
should go through qmp_blockdev_add() instead.
* Device model "usb-storage" prompts for keys. No other device model
does, and this one probably shouldn't do it, either.
* ide_issue_trim_cb() uses bdrv_aio_discard() instead of
blk_aio_discard() because it fishes its backend out of a BlockAIOCB,
which has only the BlockDriverState.
* PC87312State has an unused BlockDriverState[] member.
The next two commits take care of the latter two.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Let QEMU propagate the cpu state to kvm. If kvm doesn't yet support it, it is
silently ignored as kvm will still handle the cpu state itself in that case.
The state is not synced back, thus kvm won't have a chance to actively modify
the cpu state. To do so, control has to be given back to QEMU (which is already
done so in all relevant cases).
Setting of the cpu state can fail either because kvm doesn't support the
interface yet, or because the state is invalid/not supported. Failed attempts
will be traced
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch makes sure that halting a cpu and stopping a cpu are two different
things. Stopping a cpu will also set the cpu halted - this is needed for common
infrastructure to work (note that the stop and stopped flag cannot be used for
our purpose because they are already used by other mechanisms).
A cpu can be halted ("waiting") when it is operating. If interrupts are
disabled, this is called a "disabled wait", as it can't be woken up anymore. A
stopped cpu is treated like a "disabled wait" cpu, but in order to prepare for a
proper cpu state synchronization with the kvm part, we need to track the real
logical state of a cpu.
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
CC: Andreas Faerber <afaerber@suse.de>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
kvmclock fix. This was reverted last minute for 2.1, but it is now back
with the problematic case fixed.
Note: I will soon switch to a subkey for signing purposes. To verify
future signed pull requests from me, please update my key with
"gpg --recv-keys 9B4D86F2". You should see 3 new subkeys---the
one for signing will be a 2048-bit RSA key, 4E6B09D7.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJUJXmEAAoJEBvWZb6bTYbyuZoP+gJScjcoHVL27YL6GqCsLzQ+
swk/6TDaboR5+t3Tt1vHOWOvtXmoZlxp+sQU50ltiKS1mwRTsMAhEeTInTNPLJ6z
aKX46QdvNib2bYdUgnl8nqOiKQAZiA6rR8os2lWJ3E8n1n6qBbeWwneQK2YvRaVf
RMjmDm7sjhZt5h868O7SPNDSjk8k5TgYj5MOsDugFxQr6JXGxt8xmB+Ml6v6Ne+y
ufDchXtBhANIkr1aNMeOBYdX+/wuY+fltjnG11xpuAL1a63F/b3EGvPoqUHQCZSd
PzHmkSZV0z3rmgVTpcAHxSw2MxUEgmXLvxWAQTahrzGUi8mH3hF7z7zNqfULZWa1
3hctUdlf9aWbIgdfEdBZqOjol58TYvgCjIa7W2upsvEf7sjctpCZnMmBi8vMM3FQ
qv7S4139jN70+7zNY8CCnXPuOY63rsFTAs8XCPF7DL3HESzm7x8MEFgW7/7iJtwQ
PNJVbJX8+FpDmdAy/fwkMKA8BvjmNBuIZrr9/IfFJ6uM70//brq3OXq25agToBpZ
QYWW5CXYeQXurPBK1Q6XkGx61mNuc5zOveVS+8MhJKw9SiL/cIhQzrh3fgFtIP+e
63LlufFPMA2+fhBJctptIoU9+XeUwalQic92ZqepuXB/OZV6vhhV9/K2zZMUMpl9
B/Gg3cd1w5rMHg6O2kov
=nP/e
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
Usual mix of patches, the most important being Alex and Marcelo's
kvmclock fix. This was reverted last minute for 2.1, but it is now back
with the problematic case fixed.
Note: I will soon switch to a subkey for signing purposes. To verify
future signed pull requests from me, please update my key with
"gpg --recv-keys 9B4D86F2". You should see 3 new subkeys---the
one for signing will be a 2048-bit RSA key, 4E6B09D7.
# gpg: Signature made Fri 26 Sep 2014 15:34:44 BST using RSA key ID 9B4D86F2
# gpg: Good signature from "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: aka "Paolo Bonzini <bonzini@gnu.org>"
* remotes/bonzini/tags/for-upstream:
kvm/valgrind: don't mark memory as initialized
po: fix conflict with %.mo rule in rules.mak
kvmvapic: fix migration when VM paused and when not running Windows
serial: check if backed by a physical serial port at realize time
serial: reset state at startup
target-i386: update fp status fix
hw/dma/i8257: Silence phony error message
kvmclock: Ensure time in migration never goes backward
kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Introduce cpu_clean_all_dirty
pit: fix pit interrupt can't inject into vm after migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This exceeded the trace argument limit for LTTNG UST and wasn't really
needed as the flags value is stored anyway. Dropping this fixes the
compile failure for UST. It can probably be merged with the previous
trace shortening patch.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Recent traces rework introduced 2 tracepoints with 13 and 20
arguments. When dtrace backend is selected
(--enable-trace-backend=dtrace), compile fails as
sys/sdt.h defines DTRACE_PROBE up to DTRACE_PROBE12 only.
This splits long tracepoints.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A few files have been renamed without updating their comment here. A
few events have been added in the wrong place. Clean that up.
Comments with no space after the '#' look ugly and confuse
cleanup-trace-events.pl. Insert a space.
scripts/cleanup-trace-events.pl is now happy again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-5-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Event monitor_protocol_event is unused since commit 7517517. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-4-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Event megasas_io_read was added in commit e8f943c, but never used.
Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411476811-24251-3-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
iscsi_aio_write16_cb, iscsi_aio_writev, iscsi_aio_read16_cb, and
iscsi_aio_readv have not not been in use since commit
063c3378a9 ("block/iscsi: introduce
bdrv_co_{readv, writev, flush_to_disk}").
These were the only trace events in block/iscsi.c so drop the the
trace.h include.
Cc: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-4-git-send-email-stefanha@redhat.com
This trace event was added in commit
840a178c94 ("usb: mtp filesharing") but
never used.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-3-git-send-email-stefanha@redhat.com
This trace event has not been in use since commit
b002254dbd ("virtio-blk: Unify
{non-,}dataplane's request handlings").
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1411394595-15300-2-git-send-email-stefanha@redhat.com
This converts many kinds of debug prints to traces.
This implements packets logging to avoid unnecessary calculations if
usb_ohci_td_pkt_short/usb_ohci_td_pkt_long is not enabled.
This makes OHCI errors (such as "DMA error") invisible by default.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Convert into trace event. Otherwise the message
dma: unregistered DMA channel used nchan=0 dma_pos=0 dma_len=1
gets printed every time and fills up the log-file with 50 MiB / minute.
Signed-off-by: Philipp Hahn <hahn@univention.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With this patch the qemu console core stops using PixelFormat and pixman
format codes side-by-side, pixman format code is the primary way to
specify the DisplaySurface format:
* DisplaySurface stops carrying a PixelFormat field.
* qemu_create_displaysurface_from() expects a pixman format now.
Functions to convert PixelFormat to pixman_format_code_t (and back)
exist for those who still use PixelFormat. As PixelFormat allows
easy access to masks and shifts it will probably continue to exist.
[ xenfb added by Benjamin Herrenschmidt ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add some trace events to virtio-rng for easier debugging
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds a couple of tcg specific trace-events which are useful for
tracing execution though tcg generated blocks. It's been tested with
lttng user space tracing but is generic enough for all systems. The tcg
events are:
* translate_block - when a subject block is translated
* exec_tb - when a translated block is entered
* exec_tb_exit - when we exit the translated code
* exec_tb_nocache - special case translations
Of course we can only trace the entrance to the first block of a chain
as each block will jump directly to the next when it can. See the -d
nochain patch to allow more complete tracing at the expense of
performance.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We have the experience that the guest doesn't stop successfully
though it was instructed to shut down.
The root cause may be not in QEMU mostly. However, QEMU is often
suspected at the beginning just because the issue occurred in
virtualization environment.
Therefore, we need to affirm that QEMU received the shutdown
request and raised ACPI irq from "virsh shutdown" command,
virt-manger or stopping QEMU process to the VM .
So that we can affirm the problems was belonged to the Guset OS
rather than the QEMU itself.
When we stop guests by "virsh shutdown" command or virt-manger,
or stopping QEMU process, qemu_system_powerdown_request() or
qemu_system_shutdown_request() is called. Then the below functions
in main_loop_should_exit() of Vl.c are called roughly in the
following order.
if (qemu_powerdown_requested())
qemu_system_powerdown()
monitor_protocol_event(QEVENT_POWERDOWN, NULL)
OR
if(qemu_shutdown_requested()}
monitor_protocol_event(QEVENT_SHUTDOWN, NULL);
The tracepoint of monitor_protocol_event() already exists, but no
tracepoints are defined for qemu_system_powerdown_request() and
qemu_system_shutdown_request(). So this patch adds two tracepoints for
the two functions. We believe that it will become much easier to
isolate the problem mentioned above by these tracepoints.
Signed-off-by: Yang Zhiyong <yangzy.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently SPAPR PHB keeps track of all allocated MSI (here and below
MSI stands for both MSI and MSIX) interrupt because
XICS used to be unable to reuse interrupts. This is a problem for
dynamic MSI reconfiguration which happens when guest reloads a driver
or performs PCI hotplug. Another problem is that the existing
implementation can enable MSI on 32 devices maximum
(SPAPR_MSIX_MAX_DEVS=32) and there is no good reason for that.
This makes use of new XICS ability to reuse interrupts.
This reorganizes MSI information storage in sPAPRPHBState. Instead of
static array of 32 descriptors (one per a PCI function), this patch adds
a GHashTable when @config_addr is a key and (first_irq, num) pair is
a value. GHashTable can dynamically grow and shrink so the initial limit
of 32 devices is gone.
This changes migration stream as @msi_table was a static array while new
@msi_devs is a dynamic hash table. This adds temporary array which is
used for migration, it is populated in "spapr_pci"::pre_save() callback
and expanded into the hash table in post_load() callback. Since
the destination side does not know the number of MSI-enabled devices
in advance and cannot pre-allocate the temporary array to receive
migration state, this makes use of new VMSTATE_STRUCT_VARRAY_ALLOC macro
which allocates the array automatically.
This resets the MSI configuration space when interrupts are released by
the ibm,change-msi RTAS call.
This fixed traces to be more informative.
This changes vmstate_spapr_pci_msi name from "...lsi" to "...msi" which
was incorrect by accident. As the internal representation changed,
thus bumps migration version number.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: drop g_malloc_n usage]
Signed-off-by: Alexander Graf <agraf@suse.de>
This implements interrupt release function so IRQs can be returned back
to the pool for reuse in cases such as PCI hot plug.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The current allocator returns IRQ numbers from a pool and does not
support IRQs reuse in any form as it did not keep track of what it
previously returned, it only keeps the last returned IRQ. Some use
cases such as PCI hot(un)plug may require IRQ release and reallocation.
This moves an allocator from SPAPR to XICS.
This switches IRQ users to use new API.
This uses LSI/MSI flags to know if interrupt is allocated.
The interrupt release function will be posted as a separate patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add mhp_pc_dimm_assigned_slot & mhp_pc_dimm_assigned_address
events to trace which address and slot where assigned to
plugged in PC_DIMM device on target-i386 machine.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add events for tracing accesses to memory hotplug IO ports.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently only single TCE entry per request is supported (H_PUT_TCE).
However PAPR+ specification allows multiple entry requests such as
H_PUT_TCE_INDIRECT and H_STUFF_TCE. Having less transitions to the host
kernel via ioctls, support of these calls can accelerate IOMMU operations.
This implements H_STUFF_TCE and H_PUT_TCE_INDIRECT.
This advertises "multi-tce" capability to the guest if the host kernel
supports it (KVM_CAP_SPAPR_MULTITCE) or guest is running in TCG mode.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Modern Linux kernels support last POWERPC CPUs so when a kernel boots,
in most cases it can find a matching cpu_spec in the kernel's cpu_specs
list. However if the kernel is quite old, it may be missing a definition
of the actual CPU. To provide an ability for old kernels to work on modern
hardware, a Processor Compatibility Mode has been introduced
by the PowerISA specification.
>From the hardware prospective, it is supported by the Processor
Compatibility Register (PCR) which is defined in PowerISA. The register
enables one of the compatibility modes (2.05/2.06/2.07).
Since PCR is a hypervisor privileged register and cannot be
directly accessed from the guest, the mode selection is done via
ibm,client-architecture-support (CAS) RTAS call using which the guest
specifies what "raw" and "architected" CPU versions it supports.
QEMU works out the best match, changes a "cpu-version" property of
every CPU and notifies the guest about the change by setting these
properties in the buffer passed as a response on a custom H_CAS hypercall.
This implements ibm,client-architecture-support parameters parsing
(now only for PVRs) and cooks the device tree diff with new values for
"cpu-version", "ibm,ppc-interrupt-server#s" and
"ibm,ppc-interrupt-server#s" properties.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
The PAPR+ specification defines a ibm,client-architecture-support (CAS)
RTAS call which purpose is to provide a negotiation mechanism for
the guest and the hypervisor to work out the best compatibility parameters.
During the negotiation process, the guest provides an array of various
options and capabilities which it supports, the hypervisor adjusts
the device tree and (optionally) reboots the guest.
At the moment the Linux guest calls CAS method at early boot so SLOF
gets called. SLOF allocates a memory buffer for the device tree changes
and calls a custom KVMPPC_H_CAS hypercall. QEMU parses the options,
composes a diff for the device tree, copies it to the buffer provided
by SLOF and returns to SLOF. SLOF updates the device tree and returns
control to the guest kernel. Only then the Linux guest parses the device
tree so it is possible to avoid unnecessary reboot in most cases.
The device tree diff is a header with an update format version
(defined as 1 in this patch) followed by a device tree with the properties
which require update.
If QEMU detects that it has to reboot the guest, it silently does so
as the guest expects reboot to happen because this is usual pHyp firmware
behavior.
This defines custom KVMPPC_H_CAS hypercall. The current SLOF already
has support for it.
This implements stub which returns very basic tree (root node,
no properties) to the guest.
As the return buffer does not contain any change, no change in behavior is
expected.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
This allows guests to have a different timebase origin from the host.
This is needed for migration, where a guest can migrate from one host
to another and the two hosts might have a different timebase origin.
However, the timebase seen by the guest must not go backwards, and
should go forwards only by a small amount corresponding to the time
taken for the migration.
This is only supported for recent POWER hardware which has the TBU40
(timebase upper 40 bits) register. That includes POWER6, 7, 8 but not
970.
This adds kvm_access_one_reg() to access a special register which is not
in env->spr. This requires kvm_set_one_reg/kvm_get_one_reg patch.
The feature must be present in the host kernel.
This bumps vmstate_spapr::version_id and enables new vmstate_ppc_timebase
only for it. Since the vmstate_spapr::minimum_version_id remains
unchanged, migration from older QEMU is supported but without
vmstate_ppc_timebase.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
Exploit the new api for userspace-controlled cmma. If supported, enable
cmma during kvm initialization and register a reset handler for cmma,
which is also called directly from the load IPL code.
The reset functionality is needed to reset the cmma state of the guest
pages, e.g. if a system reset is triggered via qemu monitor; otherwise
this could result in data corruption.
A guest triggered reboot may now lead to multiple cmma resets; this is
OK, however, as this is slowpath anyway and the simplest way to achieve
the intended effects.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
* remotes/bonzini/scsi-next:
[PATCH] block/iscsi: bump year in copyright notice
block/iscsi: allow cluster_size of 4K and greater
block/iscsi: clarify the meaning of ISCSI_CHECKALLOC_THRES
block/iscsi: speed up read for unallocated sectors
block/iscsi: allow fall back to WRITE SAME without UNMAP
MAINTAINERS: mark megasas as maintained
megasas: Add MSI support
megasas: Enable MSI-X support
megasas: Implement LD_LIST_QUERY
scsi: Improve error messages more
scsi-disk: Improve error messager if can't get version number
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Nasty 0xe0 logic is gone. We map through QKeyCode now, giving us a
nice, readable mapping table.
Quick smoke test in OpenFirmware looks ok. Careful check from arch
maintainers would be very nice, especially on the capslock and numlock
logic. I'm not fully sure whenever I got it translated correctly and
also what it is supposed to do in the first place ...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
s390x introduced helper functions for getting/setting one_regs with
commit 860643bc. However, nothing about these is s390-specific.
Alexey Kardashevskiy had already posted a general version, so let's
merge the two patches and massage the code a bit.
CC: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some hardware instances do support MSI, so we should do likewise.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Newer firmware implement a LD_LIST_QUERY command, and due to a driver
issue no drives might be detected if this command isn't supported.
So add emulation for this command, too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Hannes Reinecke <hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some ONE_REGS on s390 are not protected by a capability. Older kernels
might not provide those and return an error. Fortunately these registers
are only critical for the migration path. There is no need to error out
on reset and normal runtime. Furthermore, these kernels don't provide
a proper dirty bitmap anyway, so let's use tracing for those errors.
Also provide generic one reg helper to simplify the code.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Implementation of a USB Media Transfer Device device for easy
filesharing. Read-only. No access control inside qemu, it will
happily export any file it is able to open to the guest, i.e.
standard unix access rights for the qemu process apply.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This replaces DPRINTF macro with tracepoints.
This moves some messages from migration.c to savevm.c.
This adds tracepoint to signal about fileds failed to migrate.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
The throttling delay calculation was using an inaccurate sector count to
calculate the time to sleep. This broke rate-limiting for the block
mirror job.
Move the delay calculation into mirror_iteration() where we know how
many sectors were transferred. This lets us calculate an accurate delay
time.
Reported-by: Joaquim Barrera <jbarrera@ac.upc.edu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds @idstr to savevm_section_start and savevm_section_end
tracepoints.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It might be useful for tracing migration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Handle the new CCW_CMD_SET_IND_ADAPTER command enabling adapter interrupts
on guest request. When active, host->guest notifications will be handled
via global_indicator -> queue indicators instead of queue indicators +
subchannel I/O interrupt. Indicators for virtqueues may be present at an
offset.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
This patch introduces the hypervisor call H_GET_TCE which is basically the
reverse of H_PUT_TCE, as defined in the Power Architecture Platform
Requirements (PAPR).
The hcall H_GET_TCE is required by the kdump kernel which is calling it to
retrieve the TCE set up by the panicing kernel.
Signed-off-by: Laurent Dufour <ldufour@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
PR KVM lacks support of many SPRs in set/get one register API but it does
really break PR KVM. So convert them to switchable traces for now.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
- sclp event facility: cleanup structure. This allows to use
realize/unrealize as well as migration support via vmsd
- reboot: Two fixes that make reboot much more reliable
- ipl: make elf loading more robust
- flic interrupt controller: This allows to migrate floating
interrupts, as well as clear them on reset etc.
- enable async_pf feature of KVM on s390
- several sclp fixes and cleanups
- several sigp fixes and cleanups
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
iQIcBAABAgAGBQJTDwZVAAoJEBF7vIC1phx8lx4P/Rv+UVD9XDFFF8yHuye1am40
NpRGjdarQ/9QUkS4gqyKwYUvIjAClk5id7U2d5zrfdc8XC49AH0ZhVFMdRupaOon
AUqXjOXD5zAh9bfUcewg1EK1P1VuKcp0hyh0jFlIqk9Xmidw8N5guQ6iBoTqGJD5
UYTp0PuSqIjY1RCuF4fCTCurzRd1+J2oKcQBip7BSWlVuWZlg2/hPxoIraLezlz2
huwOU9tkSGXwSRv4C6fCcukEwlqnvkE6W0MCrHrcb2T8xYwAR2Jjs0TsscbKxb+t
lIjZRiCxBrFwOLUqGN8DMYtZPffR+cigZ5bYb4o3PPJ0DQL4vLQVd8SPMPrdJhbb
M7UOaeTclSTQuzmM/Uuc1pmrFc8PDq0dg50dT3weH2bW8aSgyqutYGpmUcm1Q6kq
JLFuyswOBr1vS9o0TlBunP4+TqJJrnGvtIQ4EbRZm7zP78mBaIIrUcAZlbgOI+XI
cSjtFXkBOCz0j28J9GSHrsWMC7RQ179TGdcH/FjDpu0dNDOxH7eH5gZPQoQDAqwC
SjstqJdIFnd0qxOB1EqcgMUxbSqQYq3hoGvJ644ZrMA3T5trBn0fSw3J9ZU/qAK7
EvOKRacMfcacIj4l0aEQgpwqVmktwIYnkfetX/QAKw/4AImJz/R9GRkmYgjCfOH8
/CUfXM71zWLEdv1o5uJ5
=toIt
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/borntraeger/tags/kvm-s390-20140227' into staging
Several features, fixes and cleanups for kvm/s390:
- sclp event facility: cleanup structure. This allows to use
realize/unrealize as well as migration support via vmsd
- reboot: Two fixes that make reboot much more reliable
- ipl: make elf loading more robust
- flic interrupt controller: This allows to migrate floating
interrupts, as well as clear them on reset etc.
- enable async_pf feature of KVM on s390
- several sclp fixes and cleanups
- several sigp fixes and cleanups
* remotes/borntraeger/tags/kvm-s390-20140227: (22 commits)
s390x/ipl: Fix crash of ELF images with arbitrary entry points
s390x/kvm: Rework priv instruction handlers
s390x/kvm: Add missing SIGP CPU RESET order
s390x/kvm: Rework SIGP INITIAL CPU RESET handler
s390x/cpu: Use ioctl to reset state in the kernel
s390-ccw.img: new binary rom to match latest fixes
s390-ccw.img: Fix sporadic errors with ccw boot image - initialize css
s390-ccw.img: Fix sporadic reboot hangs: Initialize next_idx
s390x/event-facility: exploit realize/unrealize
s390x/event-facility: add support for live migration
s390x/event-facility: code restructure
s390x/event-facility: some renaming
s390x/sclp: Fixed setting of condition code register
s390x/sclp: Add missing checks to SCLP handler
s390x/sclp: Fixed the size of sccb and code parameter
s390x/eventfacility: mask out commands
s390x/virtio-hcall: Specification exception for illegal subcodes
s390x/virtio-hcall: Add range check for hypervisor call
s390x/kvm: Fixed bad SIGP SET-ARCHITECTURE handler
s390x/async_pf: Check for apf extension and enable pfault
...
Conflicts:
linux-headers/linux/kvm.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch implements a floating-interrupt controller device (flic)
which interacts with the s390 flic kvm_device.
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Introduces two simple functions:
int kvm_device_ioctl(int fd, int type, ...);
int kvm_create_device(KVMState *s, uint64_t type, bool test);
These functions wrap the basic ioctl-based interactions with KVM in a
way similar to other KVM ioctl wrappers.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1392687720-26806-4-git-send-email-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
s/offet/offset/
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
n_start can be actually calculated from offset. The number of
sectors to be allocated(n_end - n_start) can be passed in in
num. By removing n_start and n_end, we can save two parameters.
The side effect is there is a bug in qcow2.c:preallocate() that
passes incorrect n_start to qcow2_alloc_cluster_offset() is
fixed. The bug can be triggerred by a larger cluster size than
the default value(65536), for example:
./qemu-img create -f qcow2 \
-o 'cluster_size=131072,preallocation=metadata' file.img 4G
Signed-off-by: Hu Tao <hutao@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds support for special usb descriptors used by microsoft
windows. They allow more fine-grained control over driver binding and
adding entries to the registry for configuration.
As this is a guest-visible change the "msos-desc" compat property
has been added to turn this off for 1.7 + older
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# By Paolo Bonzini (17) and others
# Via Stefan Hajnoczi
* stefanha/block: (48 commits)
qemu-iotests: filter QEMU monitor \r\n
aio: make aio_poll(ctx, true) block with no fds
block: clean up bdrv_drain_all() throttling comments
qcow2: use start_of_cluster() and offset_into_cluster() everywhere
qemu-img: decrease progress update interval on convert
qemu-img: round down request length to an aligned sector
qemu-img: dynamically adjust iobuffer size during convert
block/iscsi: set bs->bl.opt_transfer_length
block: add opt_transfer_length to BlockLimits
block/iscsi: set bdi->cluster_size
qemu-img: fix usage instruction for qemu-img convert
qemu-img: add support for skipping zeroes in input during convert
qemu-nbd: add doc for option -f
qemu-iotests: add test for snapshot in qemu-img convert
qemu-img: add -l for snapshot in convert
qemu-iotests: add 058 internal snapshot export with qemu-nbd case
qemu-nbd: support internal snapshot export
snapshot: distinguish id and name in load_tmp
qemu-iotests: Split qcow2 only cases in 048
qemu-iotests: Clean up spaces in usage output
...
Message-id: 1386347807-27359-1-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Bugfixes for uas emulation.
Add remote wakeup support for ehci.
Add suspend support for xhci.
Misc minor tweaks and fixes.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABAgAGBQJSmEXxAAoJEEy22O7T6HE47t0QALonQORRj0IUAH0cOdfAhlQ3
tGMQksBCYevBatKt4iZQgkw6H0jwse6QfsgsG2dfznEO+ZWsrt9cxe1UrqxbK2PN
2PY/I9Ke1iP6tjcf9ftjqt+mZcAg/FHrbua5hb8zXRQnqu2jr0y3Cp7k2Jax4j4d
Zl2FJ+sd4lGNR3Qpb85Muxtii8XERmMqvAit72VN4VAW4iE+SQAFSOgzBC512b55
wLVc6DrbnM8I4AVJQ8RH2pMQau0/aBHFbU8By2RKbymkJmIG2nFqLH6eSJ19QgzY
CmX8yGDJM5LGAGRZCeDSeuilxFU/WCSoTtkL8cPcYUv4cSTm+forzxhVz+CVOeVu
JJsWNkaIxu4mxfRyADjUKkWoKX7ACro3ErfAWHdv8hwuhZ4uD6cf2++nXVDK9dq4
yLL2nR4YG0NTOdQNKrsUbltf9gC5cWqNRgVMJ5VfqIBGtjXdTbpGpcUEFuDDegjk
GhfN8lcpqgnFj0U4fAGLxHYXHvJRpNeWzEEANPuEYnWr2tSrgBWKkYLaooTDHt5r
FUE6lmKL+BzQYnXfWWqh1fZoiBzzrMaT3OkHc2vx/SrGLuO/rVWTzXsFQI+NGPHp
XxuyocFoKZA2yGr9h6eBBp9mtd5y0oOVxBR0WbkgvmbyxkX7Zq9r2PSoDOm26oE3
5kmApAnSij83aT06Qe8P
=2yvC
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'kraxel/tags/pull-usb-1' into staging
Improvements for usb3 bulk stream (usb core, xhci).
Bugfixes for uas emulation.
Add remote wakeup support for ehci.
Add suspend support for xhci.
Misc minor tweaks and fixes.
# gpg: Signature made Thu 28 Nov 2013 11:44:49 PM PST using RSA key ID D3E87138
# gpg: Can't check signature: public key not found
# By Hans de Goede (11) and others
# Via Gerd Hoffmann
* kraxel/tags/pull-usb-1:
usb: move usb_{hi,lo} helpers to header file.
usb: add vendor request defines
trace-events: Clean up after removal of old usb-host code
Revert "usb-tablet: Don't claim wakeup capability for USB-2 version"
ehci: implement port wakeup
xhci: Call usb_device_alloc/free_streams
usb: Add usb_device_alloc/free_streams
usb: Add max_streams attribute to endpoint info
uas: s/ui/iu/
uas: Fix response iu struct definition
uas: Bounds check tags when using streams
uas: Streams are numbered 1-y, rather then 0-x
uas: Fix / cleanup usb_uas_task error handling
uas: Only use report iu-s for task_mgmt status reporting
scsi: Add 2 new sense codes needed by uas
xhci: add support for suspend/resume
xhci: Add a few missing checks for disconnected devices
Message-id: 1385712381-30918-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@amazon.com>
Writing zeroes to a file can be done by punching a hole if
MAY_UNMAP is set.
Note that in this case ENOTSUP is not ignored, but makes
the block layer fall back to the generic implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will be used by the SCSI layer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit b5613fd neglected to drop the trace events along with the code.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Update portsc register and raise irq in case a suspended
port is woken up, so remote wakeup works on our ehci ports.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# By Stefan Weil (8) and others
# Via Michael Tokarev
* mjt/trivial-patches:
tests/.gitignore: ignore test-throttle
exec: Fix broken build for MinGW (regression)
kvm: Fix compiler warning (clang)
tcg-sparc: Fix parenthesis warning
Makefile: Remove some more files when cleaning
target-i386: Fix segment cache dump
iov: avoid "orig_len may be used unitialized" warning
vscclient: remove unnecessary use of uninitialized variable
trace-events: Clean up with scripts/cleanup-trace-events.pl again
tci: Fix qemu-alpha on 32 bit hosts (wrong assertions)
*-user: Improve documentation for lock_user function
MAINTAINERS: Add missing entry to filelist for TCI target
translate-all: Fix formatting of dump output
*-user: Fix typo in comment (ulocking -> unlocking)
docs: Fix IO port number for CPU present bitmap.
q35: Fix typo in constant DEFUALT -> DEFAULT.
configure: Undefine _FORTIFY_SOURCE prior using it
Message-id: 1379696296-32105-1-git-send-email-mjt@msgid.tls.msk.ru
Event qxl_render_blit_guest_primary_initialized is unused since commit
c58c7b9, drop it.
Commit 42e5b4c moved hw/ppc/xics.c to hw/intc/xics.c without updating
the comment in trace-events.
"scripts/cleanup-trace-events.pl trace-events | diff trace-events" is
now clean again.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
KVM request types are normally defined using hex constants but QEMU traces
print decimal values instead, which is not very convenient.
This changes the request type format from %d to %x.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.14 (GNU/Linux)
iQEcBAABAgAGBQJSIveoAAoJECgfDbjSjVRp2C8IAL7DE0oM0jfEB5DAd8jlULHx
hA8RP21rFzyU8PwtHB+72+C1ImldBge4hvhI+qbsm6PoW3RCeV/lbESIRTiv8dCO
pGUOFmv8MfJAH+WWFsle5mRisoTksYQWWBMHCOqvmaY4JL9pBQOhCLHVhV1XfjtL
hO7uGrWmlijeILv5CxYyPMYuOEdVvRSZKzE+Fp2YKfNstiQrS5fJIlqmwCHrlneW
l2atnt2d9ZV1K8QYiGg4GRVbSAMJvA1wum+0F4gnXIz9yAeOt+Ht1s8cNKQDMouJ
r2OyVgPM9aS/XaO6ejct1Sjo7Vgh/Ublrpw3lFqV/qHix6rEHwy2I3JHFEJPjvk=
=SytJ
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pc,pci,virtio fixes and cleanups
This includes pc and pci cleanups and enhancements,
and a virtio bugfix for level interrupts.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 01 Sep 2013 03:15:36 AM CDT using RSA key ID D28D5469
# gpg: Can't check signature: public key not found
# By Michael S. Tsirkin (3) and others
# Via Michael S. Tsirkin
* mst/tags/for_anthony:
virtio_pci: fix level interrupts with irqfd
pc: reduce duplication, fix PIIX descriptions
hw: Clean up bogus default boot order
pci: add config space access traces
pc: fix regression for 64 bit PCI memory
pci: Introduce helper to retrieve a PCI device's DMA address space
Message-id: 1378023590-11109-1-git-send-email-mst@redhat.com
Signed-off-by: Anthony Liguori <anthony@codemonkey.ws>
This converts old style fprintf to traces.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[agraf: change patch subject]
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds pci_cfg_read and pci_cfg_write traces for config spaces
accesses.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is quite handy to debug softmmu targets.
Reviewed-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1375016242-32651-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduces a new Xen PV PCI device which will act as a binding point for
PV drivers for Xen.
The device has parameterized vendor-id, device-id and revision to allow to
be configured as a binding point for any vendor's PV drivers.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
They're all wrong since (at least) Paolo's big source tree
reorganization. Need to shuffle some event declarations around to
keep them under the correct source file comment.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Dropped event Unused since
mirror_cow 884fea4
paio_complete 47e6b25
paio_cancel 47e6b25
usb_ehci_data 0ce668b
megasas_qf_dequeue never used
megasas_handle_frame never used
megasas_io_continue never used
megasas_iovec_map_failed never used
megasas_dcmd_map_failed never used
milkymist_softusb_mouse_event 4c15ba9
xen_map_block 6506e4f
xen_unmap_block 6506e4f
qemu_spice_start 67be672
qemu_spice_stop 67be672
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If a user chooses to turn on the auto-converge migration capability
these changes detect the lack of convergence and throttle down the
guest. i.e. force the VCPUs out of the guest for some duration
and let the migration thread catchup and help converge.
Verified the convergence using the following :
- Java Warehouse workload running on a 20VCPU/256G guest(~80% busy)
- OLTP like workload running on a 80VCPU/512G guest (~80% busy)
Sample results with Java warehouse workload : (migrate speed set to 20Gb and
migrate downtime set to 4seconds).
(qemu) info migrate
capabilities: xbzrle: off auto-converge: off <----
Migration status: active
total time: 1487503 milliseconds
expected downtime: 519 milliseconds
transferred ram: 383749347 kbytes
remaining ram: 2753372 kbytes
total ram: 268444224 kbytes
duplicate: 65461532 pages
skipped: 64901568 pages
normal: 95750218 pages
normal bytes: 383000872 kbytes
dirty pages rate: 67551 pages
---
(qemu) info migrate
capabilities: xbzrle: off auto-converge: on <----
Migration status: completed
total time: 241161 milliseconds
downtime: 6373 milliseconds
transferred ram: 28235307 kbytes
remaining ram: 0 kbytes
total ram: 268444224 kbytes
duplicate: 64946416 pages
skipped: 64903523 pages
normal: 7044971 pages
normal bytes: 28179884 kbytes
Signed-off-by: Chegu Vinod <chegu_vinod@hp.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
backup_start() creates a block job that copies a point-in-time snapshot
of a block device to a target block device.
We call backup_do_cow() for each write during backup. That function
reads the original data from the block device before it gets
overwritten. The data is then written to the target device.
Currently backup cluster size is hardcoded to 65536 bytes.
[I made a number of changes to Dietmar's original patch and folded them
in to make code review easy. Here is the full list:
* Drop BackupDumpFunc interface in favor of a target block device
* Detect zero clusters with buffer_is_zero() and use bdrv_co_write_zeroes()
* Use 0 delay instead of 1us, like other block jobs
* Unify creation/start functions into backup_start()
* Simplify cleanup, free bitmap in backup_run() instead of cb
* function
* Use HBitmap to avoid duplicating bitmap code
* Use bdrv_getlength() instead of accessing ->total_sectors
* directly
* Delete the backup.h header file, it is no longer necessary
* Move ./backup.c to block/backup.c
* Remove #ifdefed out code
* Coding style and whitespace cleanups
* Use bdrv_add_before_write_notifier() instead of blockjob-specific hooks
* Keep our own in-flight CowRequest list instead of using block.c
tracked requests. This means a little code duplication but is much
simpler than trying to share the tracked requests list and use the
backup block size.
* Add on_source_error and on_target_error error handling.
* Use trace events instead of DPRINTF()
-- stefanha]
Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# By Paolo Bonzini (11) and others
# Via Paolo Bonzini
* bonzini/iommu-for-anthony:
memory: clean up phys_page_find
memory: populate FlatView for new address spaces
memory: limit sections in the radix tree to the actual address space size
s390x: reduce TARGET_PHYS_ADDR_SPACE_BITS to 62
memory: fix address space initialization/destruction
memory: make memory_global_sync_dirty_bitmap take an AddressSpace
memory: do not duplicate memory_region_destructor_none
memory: Rename readable flag to romd_mode
memory: Replace open-coded memory_region_is_romd
memory: allow memory_region_find() to run on non-root memory regions
memory: assert that PhysPageEntry's ptr does not overflow
exec: eliminate stq_phys_notdirty
exec: make qemu_get_ram_ptr private
exec: eliminate qemu_put_ram_ptr
exec: remove obsolete comment
Message-id: 1369414987-8839-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
qemu_co_queue_next(&queue) arranges that the next queued coroutine is
run at a later point in time. This deferred restart is useful because
the caller may not want to transfer control yet.
This behavior was implemented using QEMUBH in the past, which meant that
CoQueue (and hence CoMutex and CoRwlock) had a dependency on the
AioContext event loop. This hidden dependency causes trouble when we
move to a world with multiple event loops - now qemu_co_queue_next()
needs to know which event loop to schedule the QEMUBH in.
After pondering how to stash AioContext I realized the best solution is
to not use AioContext at all. This patch implements the deferred
restart behavior purely in terms of coroutines and no longer uses
QEMUBH.
Here is how it works:
Each Coroutine has a wakeup queue that starts out empty. When
qemu_co_queue_next() is called, the next coroutine is added to our
wakeup queue. The wakeup queue is processed when we yield or terminate.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit e7a09b92b7 added a trace at each
memory freeing, but unfortunately inverted size and pointer when printing
them. Fix trace.
This also led to a compilation error on 32 bit hosts:
In file included from include/trace.h:4:0,
from trace/generated-events.c:3:
./trace/generated-tracers.h: In function ‘trace_qemu_anon_ram_free’:
./trace/generated-tracers.h:64:9: error: format ‘%zu’ expects argument of type
‘size_t’, but argument 3 has type ‘void *’ [-Werror=format]
./trace/generated-tracers.h:64:9: error: format ‘%p’ expects argument of type
‘void *’, but argument 4 has type ‘size_t’ [-Werror=format]
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1369045989-14016-1-git-send-email-hpoussin@reactos.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We switched from qemu_memalign to mmap() but then we don't modify
qemu_vfree() to do a munmap() over free(). Which we cannot do
because qemu_vfree() frees memory allocated by qemu_{mem,block}align.
Introduce a new function that does the munmap(), luckily the size is
available in the RAMBlock.
Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is preparatory to the introduction of a separate freeing API.
Reported-by: Amos Kong <akong@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Message-id: 1368454796-14989-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This provides a way to detect the cast that leads to a (reproducible)
crash even when QOM cast debugging is disabled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1368188203-3407-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch enable us to know exit reason of KVM_RUN. It will help us
know where the trouble is caused.
Signed-off-by: Kazuya Saito <saito.kazuya@jp.fujitsu.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch adds tracepoints at ioctl to kvm. Tracing these ioctl is
useful for clarification whether the cause of troubles is qemu or kvm.
Signed-off-by: Kazuya Saito <saito.kazuya@jp.fujitsu.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This fixes the following error:
In file included from qemu/include/trace.h:4:0,
from trace/generated-events.c:3:
./trace/generated-tracers.h: In function ‘trace_pvscsi_get_sg_list’:
./trace/generated-tracers.h:4271:9: error: format ‘%lu’ expects argument of
type ‘long unsigned int’, but argument 4 has type ‘size_t’ [-Werror=format]
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Report the supported speeds for device and port in the error message.
Also add the speeds to the tracepoint. And while being at it drop
the redundant error message in usb_desc_attach, usb_device_attach will
report the error anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Yan Vugenfirer <yan@daynix.com>
[ Rename files to vmw_pvscsi, fix setting of hostStatus in
pvscsi_request_cancelled - Paolo ]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reimplement usb-host on top of libusb.
Reasons to do this:
(1) Largely rewritten from scratch, nice opportunity to kill historical
cruft.
(2) Offload usbfs handling to libusb.
(3) Have a single portable code base instead of bsd + linux variants.
(4) Bring usb-host support to any platform supported by libusbx.
For now this goes side-by-side to the existing code. That is only to
simplify regression testing though, at the end of the day I want remove
the old code and support libusb exclusively. Merge early in 1.5 cycle,
remove the old code after 1.5 release or something like this.
Thanks to qdev the old and new code can coexist nicely on linux. Just
use "-device usb-host-linux" to use the old linux driver instead of the
libusb one (which takes over the "usb-host" name).
The bsd driver isn't qdev'ified so it isn't that easy for bsd.
I didn't bother making it runtime switchable, so you have to rebuild
qemu with --disable-libusb to get back the old code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Check for port reset first and skip everything else then.
Add sanity checks for PLS updates.
Add PLC notification when entering PLS_U0 state.
This gets host-initiated port resume going on win8.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Make gui update rate adaption code in gui_update() actually work.
Sprinkle in a tracepoint so you can see the code at work. Remove
the update rate adaption code in vnc and make vnc simply use the
generic bits instead.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Hardcode depth to 32 bpp. It effectively was that way before because
that is the default surface depth, this just makes it explicit in the
code.
Rename depth to new_depth to make it consistent with the new_width +
new_height names. In theory we can make new_depth changeable (i.e.
allow the guest to fill in -- say -- 16 there). In practice the guests
don't try, the X-Server refuses to start if you ask it to use 16bpp
depth (via DefaultDepth in the Screen section).
Always return the correct rmask+gmask+bmask values for the given
new_depth.
Fix mode setting to also verify at new_depth to make sure we have a
correct DisplaySurface, even if the current video mode happes to be
16bpp (set by vgabios via bochs vbe interface). While being at it
switch over to use qemu_create_displaysurface_from, so the surface is
backed by guest-visible video memory and we save a memcpy.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# By Kevin Wolf (22) and Peter Lieven (1)
# Via Stefan Hajnoczi
* stefanha/block: (23 commits)
block: Fix direct use of protocols as driver for bdrv_open()
qcow2: Gather clusters in a looping loop
qcow2: Move cluster gathering to a non-looping loop
qcow2: Allow requests with multiple l2metas
qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
qcow2: Prepare handle_alloc/copied() for byte granularity
qcow2: handle_copied(): Implement non-zero host_offset
qcow2: handle_copied(): Get rid of keep_clusters parameter
qcow2: handle_copied(): Get rid of nb_clusters parameter
qcow2: Factor out handle_copied()
qcow2: Clean up handle_alloc()
qcow2: Finalise interface of handle_alloc()
qcow2: handle_alloc(): Get rid of keep_clusters parameter
qcow2: handle_alloc(): Get rid of nb_clusters parameter
qcow2: Factor out handle_alloc()
qcow2: Decouple cluster allocation from cluster reuse code
qcow2: Change handle_dependency to byte granularity
qcow2: Improve check for overlapping allocations
qcow2: Handle dependencies earlier
qcow2: Remove bogus unlock of s->lock
...
This patch enables us to know RunState transition. It will be userful
for investigation when the trouble occured in special event such like
live migration, shutdown, suspend, and so on.
Signed-off-by: Kazuya Saito <saito.kazuya@jp.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Decouple DisplaySurface allocation & deallocation from DisplayState.
Replace dpy_gfx_resize + dpy_gfx_setdata with a dpy_gfx_replace_surface
function.
This handles the graphic hardware emulation.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Split callbacks into separate Ops struct. Pass DisplayChangeListener
pointer as first argument to all callbacks. Uninline a bunch of
display functions and move them from console.h to console.c
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Move global variables into a struct so multiple thread pools can be
supported in the future.
This patch does not change thread-pool.h interfaces. There is still a
global thread pool and it is not yet possible to create/destroy
individual thread pools. Moving the variables into a struct first makes
later patches easier to review.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
This patch allows to specify multiple directories where qemu should look
for data files. To implement that the behavior of the -L switch is
slightly different now: Instead of replacing the data directory the
path specified will be appended to the data directory list. So when
specifiying -L multiple times all directories specified will be checked,
in the order they are specified on the command line, instead of just the
last one.
Additionally the default paths are always appended to the directory
data list. This allows to specify a incomplete directory (such as the
seabios out/ directory) via -L. Anything not found there will be loaded
from the default paths, so you don't have to create a symlink farm for
all the rom blobs.
For trouble-shooting a tracepoint has been added, logging which blob
has been loaded from which location.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1362739344-8068-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add streams support to the xhci emulation. No secondary streams yet,
only linear stream arays are supported for now.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a new virtio transport that uses channel commands to perform
virtio operations.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Provide a mechanism for qemu to provide fully virtual subchannels to
the guest.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Provide handlers for (most) channel I/O instructions.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Yet another optimization is to extend the mirroring iteration to include more
adjacent dirty blocks. This limits the number of I/O operations and makes
mirroring efficient even with a small granularity. Most of the infrastructure
is already in place; we only need to put a loop around the computation of
the origin and sector count of the iteration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
With AIO support in place, we can start copying more than one chunk
in parallel. This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.
Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.
In addition, two different iterations on the HBitmap may want to
copy the same cluster. We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is really no change in the behavior of the job here, since
there is still a maximum of one in-flight I/O operation between
the source and the target. However, this patch already introduces
the AIO callbacks (which are unmodified in the next patch)
and some of the logic to count in-flight operations and only
complete the job when there is none.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When mirroring runs, the backing files for the target may not yet be
ready. However, this means that a copy-on-write operation on the target
would fill the missing sectors with zeros. Copy-on-write only happens
if the granularity of the dirty bitmap is smaller than the cluster size
(and only for clusters that are allocated in the source after the job
has started copying). So far, the granularity was fixed to 1MB; to avoid
the problem we detected the situation and required the backing files to
be available in that case only.
However, we want to lower the granularity for efficiency, so we need
a better solution. The solution is to always copy a whole cluster the
first time it is touched. The code keeps a bitmap of clusters that
have already been allocated by the mirroring job, and only does "manual"
copy-on-write if the chunk being copied is zero in the bitmap.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
HBitmaps provides an array of bits. The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.
In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level. When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).
Given an index in the bitmap, it can be split in group of bits like
this (for the 64-bit case):
bits 0-57 => word in the last bitmap | bits 58-63 => bit in the word
bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits. To move down, you shift the index left
similarly, and add the word index within the group. Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.
Setting or clearing a range of m bits on all levels, the work to perform
is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
When iterating on a bitmap, each bit (on any level) is only visited
once. Hence, The total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps. Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Many callers pass size_t, which gets silently truncated to uint32_t.
Harmless, because all practical sizes are well below 4GiB. Clean it
up anyway. Size overflow now fails assertions.
Bonus: saves a whole bunch of silly casts.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
virtio-blk-data-plane is a subset implementation of virtio-blk. It only
handles read, write, and flush requests. It does this using a dedicated
thread that executes an epoll(2)-based event loop and processes I/O
using Linux AIO.
This approach performs very well but can be used for raw image files
only. The number of IOPS achieved has been reported to be several times
higher than the existing virtio-blk implementation.
Eventually it should be possible to unify virtio-blk-data-plane with the
main body of QEMU code once the block layer and hardware emulation is
able to run outside the global mutex.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>