Commit Graph

47576 Commits

Author SHA1 Message Date
Cao jin f077f88912 mptsas: Fix a migration compatible issue
My previous commit 2e2aa316 removed internal flag msi_in_use, which
exists in vmstate, use VMSTATE_UNUSED for migration compatibility.

Reported-by: Amit Shah <amit.shah@redhat.com>
Suggested-by: Amit Shah <amit.shah@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
2016-07-29 06:09:55 +03:00
Marc-André Lureau 499c557975 vhost: do not update last avail idx on get_vring_base() failure
The state.num value will probably be 0 in this case, but that
doesn't make sense to update.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 05:47:17 +03:00
Marc-André Lureau 950d94ba06 vhost: add vhost_net_set_backend()
Not all vhost-user backends support ops->vhost_net_set_backend(). It is
a nicer to provide an assert/error than to crash trying to
call. Furthermore, it improves a bit the code by hiding vhost_ops
details.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:49 +03:00
Marc-André Lureau f6b8571041 vhost-user: add error report in vhost_user_write()
Similar to vhost_user_read() error report, it is useful to have early
error report.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:49 +03:00
Marc-André Lureau 69179fe2fc tests: fix vhost-user-test leak
Spotted by valgrind.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:49 +03:00
Marc-André Lureau 1ec3b71cde tests: plug some leaks in virtio-net-test
Found thanks to valgrind.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:49 +03:00
Marc-André Lureau c89804d674 vhost-user: wait until backend init is completed
The chardev waits for an initial connection before starting qemu, and
vhost-user should wait for the backend negotiation to be completed
before starting qemu too.

vhost-user is started in the net_vhost_user_event callback, which is
synchronously called after the socket is connected. Use a
VhostUserState.started flag to indicate vhost-user init completed
successfully and qemu can be started.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:49 +03:00
Marc-André Lureau d7a04fd7d5 char: add and use tcp_chr_wait_connected
Add a chr_wait_connected for the tcp backend, and use it in the
open_socket() function.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau 6b6723c3b5 char: add chr_wait_connected callback
A function to wait on the backend to be connected, to be used in the
following patches.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau 8695de0fcf vhost: add assert() to check runtime behaviour
All these functions must be called only after the backend is connected.
They are called from virtio-net.c, after either virtio or link status
change.

The check for nc->peer->link_down should ensure vhost_net_{start,stop}()
are always called between vhost_user_{start,stop}().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau 51f7aca973 vhost-net: vhost_migration_done is vhost-user specific
Either the callback is mandatory to implement, in which case an assert()
is more appropriate, or it's not and we can't tell much whether the
function should fail or not (given it's name, I guess it should silently
success by default). Instead, make the implementation mandatory and
vhost-user specific to be more clear about its usage.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau bb12e761e8 Revert "vhost-net: do not crash if backend is not present"
Now that get_vhost_net() returns non-null after a successful
vhost_net_init(), we no longer need to check this case.

This reverts commit ecd34898596c60f79886061618dd7e01001113ad.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau 1a5b68cee8 vhost-user: add get_vhost_net() assertions
Add a few assertions to be more explicit about the runtime behaviour
after the previous patch: get_vhost_net() is non-null after
net_vhost_user_init().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau e6bcb1b617 vhost-user: keep vhost_net after a disconnection
Many code paths assume get_vhost_net() returns non-null.

Keep VhostUserState.vhost_net after a successful vhost_net_init(),
instead of freeing it in vhost_net_cleanup().

VhostUserState.vhost_net is thus freed before after being recreated or
on final vhost_user_cleanup() and there is no need to save the acked
features.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau c4843a45e3 vhost-user: check vhost_user_{read,write}() return value
The vhost-user code is quite inconsistent with error handling. Instead
of ignoring some return values of read/write and silently going on with
invalid state (invalid read for example), break the code flow when the
error happened.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau 6fab2f3f60 vhost-user: check qemu_chr_fe_set_msgfds() return value
Check qemu_chr_fe_set_msgfds() for errors, to make sure the message to
be sent is correct.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:48 +03:00
Marc-André Lureau df3485a148 vhost-user: call set_msgfds unconditionally
It is fine to call set_msgfds() with 0 fd, and ensures any previous fd
array is cleared.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau 5c7eaabf65 qemu-char: fix qemu_chr_fe_set_msgfds() crash when disconnected
Calling qemu_chr_fe_set_msgfds() on unconnected socket leads to crash
since s->ioc is NULL in this case. Return an error earlier instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau 4afba63120 vhost: use error_report() instead of fprintf(stderr,...)
Let's use qemu proper error reporting API, this ensures the error is
reported at the right place (stderr or monitor), with a conventional
format.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau c640969216 vhost: add missing VHOST_OPS_DEBUG
Add missing VHOST_OPS_DEBUG() logs, for completeness.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau 162bba7fa8 vhost: do not assert() on vhost_ops failure
Calling a vhost operation may fail, for example with disconnected
vhost-user backend, but qemu shouldn't abort in this case.

Log an error instead, except on error and cleanup code paths where it
can be mostly ignored.

Let's use a VHOST_OPS_DEBUG macro to easily disable those messages once
disconnected backend stabilizes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau a06db3ec72 vhost: fix calling vhost_dev_cleanup() after vhost_dev_init()
vhost_net_init() calls vhost_dev_init() and in case of failure, calls
vhost_dev_cleanup() directly. However, the structure is already
partially cleaned on error. Calling vhost_dev_cleanup() again will call
vhost_virtqueue_cleanup() on already clean queues, and causing potential
double-close. Instead, adjust dev->nvqs and simplify vhost_dev_init()
code to not call vhost_virtqueue_cleanup() but vhost_dev_cleanup()
instead.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau f1a0365b68 vhost-net: always call vhost_dev_cleanup() on failure
vhost_dev_init(), calling vhost backend initialization, should be
cleaned up after failure too. Call vhost_dev_cleanup() in all failure
cases. First, it needs to zero-alloc the struct to avoid the initial
garbage.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau e0547b59dc vhost: make vhost_dev_cleanup() idempotent
It is called on multiple code path, so make it safe to call several
times (note: I don't remember a reproducer here, but a function called
'cleanup' should probably be idempotent in my book)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:47 +03:00
Marc-André Lureau 5be5f9be72 vhost: fix cleanup on not fully initialized device
If vhost_dev_init() failed, caller may still call vhost_dev_cleanup()
later. However, vhost_dev_cleanup() tries to remove the device from the
list even if it wasn't yet added, which may lead to crashes. Similarly
for the memory listener.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Marc-André Lureau 7b527247f0 vhost: assert the log was cleaned up
Make sure the log was released on cleanup, or it will leak (the
alternative is to call vhost_log_put() unconditionally, but it may hide
some dev state issues).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Marc-André Lureau 9e0bc24fa5 vhost: make vhost_log_put() idempotent
Although not strictly required, it is nice to have vhost_log_put()
safely callable multiple times.

Clear dev->log* when calling vhost_log_put() to make the function
idempotent. This also simplifies a bit the caller work.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Marc-André Lureau 7cb8a9b9f2 vhost: don't assume opaque is a fd, use backend cleanup
vhost-dev opaque isn't necessarily an fd, it can be a chardev when using
vhost-user. Goto fail, so vhost_backend_cleanup() is called to handle
backend cleanup appropriately.

vhost_set_backend_type() should never fail, use an assert().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Marc-André Lureau 9c7d18b3a5 vhost-user: disconnect on HUP
In some cases, qemu_chr_fe_read_all() on HUP event doesn't raise
CHR_EVENT_CLOSED because the read/recv function returns -1 on
disconnected peers (for example with tch_chr_recv, an ECONNRESET errno
overwritten as EIO).

It is simpler to explicitely disconnect on HUP, rising CHR_EVENT_CLOSED
if it wasn't disconnected already.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Marc-André Lureau d9d261142d vhost-user: minor simplification
Shorten the code and make it more clear by using the specialized
function g_str_has_prefix().

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Marc-André Lureau 01edc230d9 misc: indentation
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:33:46 +03:00
Prasad J Pandit 1e7aed7014 virtio: check vring descriptor buffer length
virtio back end uses set of buffers to facilitate I/O operations.
An infinite loop unfolds in virtqueue_pop() if a buffer was
of zero size. Add check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-07-29 00:07:10 +03:00
Marcel Apfelbaum 9a4c0e220d hw/virtio-pci: fix virtio behaviour
Enable transitional virtio devices by default.
Enable virtio-1.0 for devices plugged into
PCIe ports (Root ports or Downstream ports).

Using the virtio-1 mode will remove the limitation
of the number of devices that can be attached to a machine
by removing the need for the IO BAR.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-07-29 00:07:10 +03:00
Wei Jiangang be0d9760d7 apb: convert init to realize
Convert a device model where initialization obviously can't fail,
make it implement realize() rather than init().

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-07-29 00:07:09 +03:00
Wei Jiangang 86395eb31f hw/pci-bridge: Convert pxb initialization functions to Error
Firstly, convert pxb_dev_init_common() to Error and rename
it to pxb_dev_realize_common().
Actually, pxb_register_bus() is converted as well.

And then,
convert pxb_dev_initfn() and pxb_pcie_dev_initfn() to Error,
rename them to pxb_dev_realize() and pxb_pcie_dev_realize()
respectively.

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:09 +03:00
Marcel Apfelbaum 16de88a416 hw/apci: handle 64-bit MMIO regions correctly
In build_crs(), the calculation and merging of the ranges already happens
in 64-bit, but the entry boundaries are silently truncated to 32-bit in the
call to aml_dword_memory(). Fix it by handling the 64-bit MMIO ranges separately.
This fixes 64-bit BARs behind PXBs.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:09 +03:00
Marcel Apfelbaum 2df5a7b52f acpi: refactor pxb crs computation
Instead of always passing both IO and MEM ranges when
computing CRS ranges, define a new CrsRangeSet structure
that include them both.

This is done before introducing a third type of range,
64-bit MEM, so it will be easier to pass them all around.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:09 +03:00
Marcel Apfelbaum c99cb18eeb hw/acpi: fix a DSDT table issue when a pxb is present.
PXBs do not support hotplug so they don't have a PCNT function.
Since the PXB's PCI root-bus is a child bus of bus 0, the
build_dsdt code will add a call to the corresponding PCNT function.

Fix this by skipping the PCNT call for the above case.
While at it skip also PCIe child buses.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:09 +03:00
Marcel Apfelbaum 7b346c742c hw/pxb: declare pxb devices as not hot-pluggable
Prevent future issues when hotplug will work for devices
attached to pxbs.

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:09 +03:00
Marcel Apfelbaum 2c533c5479 hw/pcie-root-port: Fix PCIe root port initialization
Specify the root port interrupt pin as part of the init
process for cases when msi/msix are not enabled.

Fixes "hw/pci/pci.c:196:23: runtime error: shift exponent -1 is negative"
warning from clang's sanitizer.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:09 +03:00
Michael S. Tsirkin 6b4495401b pcie: fix link active status bit migration
We changed link status register in pci express endpoint capability
over time. Specifically,

commit b2101eae63 ("pcie: Set the "link
active" in the link status register") set data link layer link active
bit in this register without adding compatibility to old machine types.

When migrating from qemu 2.3 and older this affects xhci devices which
under machine type 2.0 and older have a pci express endpoint capability
even if they are on a pci bus.

Add compatibility flags to make this bit value match what it was under
2.3.

Additionally, to avoid breaking migration from qemu 2.3 and up,
suppress checking link status during migration: this seems sane
since hardware can change link status at any time.

https://bugzilla.redhat.com/show_bug.cgi?id=1352860

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Fixes: b2101eae63
    ("pcie: Set the "link active" in the link status register")
Cc: qemu-stable@nongnu.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-07-29 00:07:08 +03:00
Peter Maydell 2d2e632ad0 Update version for v2.7.0-rc0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-22 15:32:42 +01:00
Peter Maydell 01a720125f target-sh4: Use glib allocator in movcal helper
Coverity spots that helper_movcal() calls malloc() but doesn't
check for failure. Fix this by switching to the glib allocation
functions, which abort on allocation failure.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1468327859-21385-1-git-send-email-peter.maydell@linaro.org
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
2016-07-22 11:33:24 +01:00
Peter Maydell e3643d32ee Migration:
- Fix a postcopy bug
 - Add a testsuite for measuring migration performance
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJXkdG8AAoJEOsLTfxlfvZwCxQQAJIllwdcPvDHqRKkGt5WudcF
 HcyqwNqZ75/zfHUpmwx6nou+xis+iO0hB0KGbVzNxRn/YRRYo1xYEzvOdrMZ6sDS
 2Np3xFVt9PVD5i5fZEJhkNXUCBUbMQ5fYOLE5wYjFGgE50uEbHpKfBvXoqTz3/Ro
 rtjVGiRLyy5Fbua7/9X4HLDXe5M4zDI+AwtnuzqU92teioF/FrDR9Ul6HFApcLBH
 kOiAy5eS3viR86poCvuh66IROHJ8BPHAMiq1c5vHPBToL0gvs4bFRLN2KRPotOMe
 2xafJz3Z48YeJNzWrZu65Z3hWN49JCD9OlWWhFHBBNO9h752QnMqzifKDKEaU0lO
 acfFllE/kALKGo7vTaUH8PlibNen6y9VSWhiII28iq+LhymA4LSn2Xg3PwE2U+5R
 tqqE6eYw5Jz5cV+GgFMFdHvTKFcmlvDHoDRi4IMTW6oGpydVVkw/wh03GlFNdVqk
 nj1WZGOyG+oR/h9ZMDm4fPtWuoPtS8WKjsmsOm0Dfo4FAMPoz2au5ORgHZE25pvq
 G6CnhNxYvKPEnLGjF67CyBlge79C/+pnLOePPmKteSK7odSit3ymi3ia1W6XFo/5
 +hSm0EbIeaPf8oBfBj6IE95rPRsPGR+1CFRhTMzNhxGYY7uFPsCDe9mAaXukCf29
 HE3ZoDwgVQoAY3dPsrDo
 =wFcD
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.7-6' into staging

Migration:
- Fix a postcopy bug
- Add a testsuite for measuring migration performance

# gpg: Signature made Fri 22 Jul 2016 08:56:44 BST
# gpg:                using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337  2735 1E9A 3B5F 8540 83B6
#      Subkey fingerprint: CC63 D332 AB8F 4617 4529  6534 EB0B 4DFC 657E F670

* remotes/amit-migration/tags/migration-for-2.7-6:
  tests: introduce a framework for testing migration performance
  scripts: ensure monitor socket has SO_REUSEADDR set
  scripts: set timeout when waiting for qemu monitor connection
  scripts: refactor the VM class in iotests for reuse
  scripts: add a 'debug' parameter to QEMUMonitorProtocol
  scripts: add __init__.py file to scripts/qmp/
  migration: set state to post-migrate on failure

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-22 10:51:32 +01:00
Daniel P. Berrange 409437e16d tests: introduce a framework for testing migration performance
This introduces a moderately general purpose framework for
testing performance of migration.

The initial guest workload is provided by the included 'stress'
program, which is configured to spawn one thread per guest CPU
and run a maximally memory intensive workload. It will loop
over GB of memory, xor'ing each byte with data from a 4k array
of random bytes. This ensures heavy read and write load across
all of guest memory to stress the migration performance. While
running the 'stress' program will record how long it takes to
xor each GB of memory and print this data for later reporting.

The test engine will spawn a pair of QEMU processes, either on
the same host, or with the target on a remote host via ssh,
using the host kernel and a custom initrd built with 'stress'
as the /init binary. Kernel command line args are set to ensure
a fast kernel boot time (< 1 second) between launching QEMU and
the stress program starting execution.

None the less, the test engine will initially wait N seconds for
the guest workload to stablize, before starting the migration
operation. When migration is running, the engine will use pause,
post-copy, autoconverge, xbzrle compression and multithread
compression features, as well as downtime & bandwidth tuning
to encourage completion. If migration completes, the test engine
will wait N seconds again for the guest workooad to stablize on
the target host. If migration does not complete after a preset
number of iterations, it will be aborted.

While the QEMU process is running on the source host, the test
engine will sample the host CPU usage of QEMU as a whole, and
each vCPU thread. While migration is running, it will record
all the stats reported by 'query-migration'. Finally, it will
capture the output of the stress program running in the guest.

All the data produced from a single test execution is recorded
in a structured JSON file. A separate program is then able to
create interactive charts using the "plotly" python + javascript
libraries, showing the characteristics of the migration.

The data output provides visualization of the effect on guest
vCPU workloads from the migration process, the corresponding
vCPU utilization on the host, and the overall CPU hit from
QEMU on the host. This is correlated from statistics from the
migration process, such as downtime, vCPU throttling and iteration
number.

While the tests can be run individually with arbitrary parameters,
there is also a facility for producing batch reports for a number
of pre-defined scenarios / comparisons, in order to be able to
get standardized results across different hardware configurations
(eg TCP vs RDMA, or comparing different VCPU counts / memory
sizes, etc).

To use this, first you must build the initrd image

 $ make tests/migration/initrd-stress.img

To run a a one-shot test with all default parameters

 $ ./tests/migration/guestperf.py > result.json

This has many command line args for varying its behaviour.
For example, to increase the RAM size and CPU count and
bind it to specific host NUMA nodes

 $ ./tests/migration/guestperf.py \
       --mem 4 --cpus 2 \
       --src-mem-bind 0 --src-cpu-bind 0,1 \
       --dst-mem-bind 1 --dst-cpu-bind 2,3 \
       > result.json

Using mem + cpu binding is strongly recommended on NUMA
machines, otherwise the guest performance results will
vary wildly between runs of the test due to lucky/unlucky
NUMA placement, making sensible data analysis impossible.

To make it run across separate hosts:

 $ ./tests/migration/guestperf.py \
       --dst-host somehostname > result.json

To request that post-copy is enabled, with switchover
after 5 iterations

 $ ./tests/migration/guestperf.py \
       --post-copy --post-copy-iters 5 > result.json

Once a result.json file is created, a graph of the data
can be generated, showing guest workload performance per
thread and the migration iteration points:

 $ ./tests/migration/guestperf-plot.py --output result.html \
        --migration-iters --split-guest-cpu result.json

To further include host vCPU utilization and overall QEMU
utilization

 $ ./tests/migration/guestperf-plot.py --output result.html \
        --migration-iters --split-guest-cpu \
	--qemu-cpu --vcpu-cpu result.json

NB, the 'guestperf-plot.py' command requires that you have
the plotly python library installed. eg you must do

 $ pip install --user  plotly

Viewing the result.html file requires that you have the
plotly.min.js file in the same directory as the HTML
output. This js file is installed as part of the plotly
python library, so can be found in

  $HOME/.local/lib/python2.7/site-packages/plotly/offline/plotly.min.js

The guestperf-plot.py program can accept multiple json files
to plot, enabling results from different configurations to
be compared.

Finally, to run the entire standardized set of comparisons

  $ ./tests/migration/guestperf-batch.py \
       --dst-host somehost \
       --mem 4 --cpus 2 \
       --src-mem-bind 0 --src-cpu-bind 0,1 \
       --dst-mem-bind 1 --dst-cpu-bind 2,3
       --output tcp-somehost-4gb-2cpu

will store JSON files from all scenarios in the directory
named tcp-somehost-4gb-2cpu

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:39 +05:30
Daniel P. Berrange 168ae6c24b scripts: ensure monitor socket has SO_REUSEADDR set
If tests use a TCP based monitor socket, the connection will
go into a TIMED_WAIT state when the test exits. This will
randomly prevent the test from being re-run without a certain
time period. Set the SO_REUSEADDR flag on the socket to ensure
we can immediately re-run the tests

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-6-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:35 +05:30
Daniel P. Berrange 238064621f scripts: set timeout when waiting for qemu monitor connection
If QEMU fails to launch for some reason, the QEMUMonitorProtocol
class accept() method will wait forever in a socket accept call.
Set a timeout of 15 seconds so that we fail more gracefully
instead of hanging the test script forever

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-5-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:28 +05:30
Daniel P. Berrange 6661397446 scripts: refactor the VM class in iotests for reuse
The iotests module has a python class for controlling QEMU
processes. Pull the generic functionality out of this file
and create a scripts/qemu.py module containing a QEMUMachine
class. Put the QTest integration support into a subclass
QEMUQtestMachine.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-4-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:24 +05:30
Daniel P. Berrange 991e7c4650 scripts: add a 'debug' parameter to QEMUMonitorProtocol
Add a 'debug' parameter to the QEMUMonitorProtocol class
which will cause it to print out all JSON strings on
sys.stderr

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-3-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:17 +05:30
Daniel P. Berrange 6f7a4a81ce scripts: add __init__.py file to scripts/qmp/
When searching for modules to load, python will ignore any
sub-directory which does not contain __init__.py. This means
that both scripts and scripts/qmp/ have to be explicitly added
to the python path. By adding a __init__.py file to scripts/qmp,
we only need add scripts/ to the python path and can then simply
do 'from qmp import qmp' to load scripts/qmp/qmp.py.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1469020993-29426-2-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:13 +05:30