Commit Graph

138 Commits

Author SHA1 Message Date
Juan Quintela 5e58f968f4 ram: Rename flush_page_queue() to migration_page_queue_free()
It reflects better what it does.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Kevin Wolf d35ff5e6b3 block: Ignore guest dev permissions during incoming migration
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
2017-04-07 14:44:05 +02:00
Daniel P. Berrange 4af245dc3e migration: use "" as the default for tls-creds/hostname
The tls-creds parameter has a default value of NULL indicating
that TLS should not be used. Setting it to non-NULL enables
use of TLS. Once tls-creds are set to a non-NULL value via the
monitor, it isn't possible to set them back to NULL again, due
to current implementation limitations. The empty string is not
a valid QObject identifier, so this switches to use "" as the
default, indicating that TLS will not be used

The tls-hostname parameter has a default value of NULL indicating
the the hostname from the migrate connection URI should be used.
Again, once tls-hostname is set non-NULL, to override the default
hostname for x509 cert validation, it isn't possible to reset it
back to NULL via the monitor. The empty string is not a valid
hostname, so this switches to use "" as the default, indicating
that the migrate URI hostname should be used.

Using "" as the default for both, also means that the monitor
commands "info migrate_parameters" / "query-migrate-parameters"
will report existance of tls-creds/tls-parameters even when set
to their default values.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 08:57:08 +01:00
Dr. David Alan Gilbert 67f11b5c23 postcopy: Record largest page size
Record the largest page size in use; we'll need it soon for allocating
temporary buffers.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170224182844.32452-7-dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:23 +00:00
Ashijeet Acharya 7562f90707 migrate: Introduce a 'dc->vmsd' check to avoid segfault for --only-migratable
Commit a3a3d8c7 introduced a segfault bug while checking for
'dc->vmsd->unmigratable' which caused QEMU to crash when trying to add
devices which do no set their 'dc->vmsd' yet while initialization.
Place a 'dc->vmsd' check prior to it so that we do not segfault for
such devices.

NOTE: This doesn't compromise the functioning of --only-migratable
option as all the unmigratable devices do set their 'dc->vmsd'.

Introduce a new function check_migratable() and move the
only_migratable check inside it, also use stubs to avoid user-mode qemu
build failures.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1487009088-23891-1-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:30:22 +00:00
Daniel Henrique Barboza 87c9cc1c30 Changing error message of QMP 'migrate_set_downtime' to seconds
Using QMP, the error message of 'migrate_set_downtime' was displaying
the values in milliseconds, being misleading with the command that
accepts the value in seconds:

{ "execute": "migrate_set_downtime", "arguments": {"value": 3000}}
{"error": {"class": "GenericError", "desc": "Parameter 'downtime_limit'
expects an integer in the range of 0 to 2000000 milliseconds"}}

This message is also seen in HMP when trying to set the same
parameter:

(qemu) migrate_set_parameter downtime-limit 3000000
Parameter 'downtime_limit' expects an integer in the range of 0 to
2000000 milliseconds

To allow for a proper error message when using QMP, a validation
of the user input was added in 'qmp_migrate_set_downtime'.

Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Message-Id: <20170222151729.5812-1-danielhb@linux.vnet.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-28 11:29:00 +00:00
zhanghailiang 479125d53e COLO: fix setting checkpoint-delay not working properly
If we set checkpoint-delay through command 'migrate-set-parameters',
It will not take effect until we finish last sleep chekpoint-delay,
That's will be offensive espeically when we want to change its value
from an extreme big one to a proper value.

Fix it by using timer to realize checkpoint-delay.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <1484657864-21708-2-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13 17:27:13 +00:00
Pavel Butsykin ced1c6166e migration: discard non-dirty ram pages after the start of postcopy
After the start of postcopy migration there are some non-dirty pages which have
already been migrated. These pages are no longer needed on the source vm so that
we can free them and it doen't hurt to complete the migration.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170203152321.19739-4-pbutsykin@virtuozzo.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-02-13 17:27:13 +00:00
Pavel Butsykin 53f09a1076 add 'release-ram' migrate capability
This feature frees the migrated memory on the source during postcopy-ram
migration. In the second step of postcopy-ram migration when the source vm
is put on pause we can free unnecessary memory. It will allow, in particular,
to start relaxing the memory stress on the source host in a load-balancing
scenario.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170203152321.19739-3-pbutsykin@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
   Manually merged in Pavel's 'migration: madvise error_report fixup!'
2017-02-13 17:27:13 +00:00
Dr. David Alan Gilbert ef8d6488d2 postcopy: Recover block devices on early failure
An early postcopy failure can be recovered from as long as we know
we haven't sent the command to run the destination.
We have to undo the bdrv_inactivate_all by calling
bdrv_invalidate_cache_all

Note that I'm not using ms->block_inactive because once we've
sent the postcopy package we dont want anything else to try
and recover the block storage on the source; the destination
might have started writing to it.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170202155909.31784-3-dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-02-06 13:36:49 +01:00
Juan Quintela b4b076daf3 migration: create Migration Incoming State at init time
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1485207141-1941-3-git-send-email-quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-02-06 13:36:49 +01:00
Pankaj Gupta 009fad7f4c migration: Change name of live migration thread
Change the name of live migration thread from 'migration'
to 'live_migration' to identify it clearly. 'migration'
is a generic word and kernel also has  tasks for process
migration with the name 'migration/cpu#'.

Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Message-Id: <1485178976-15225-1-git-send-email-pagupta@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-01-24 18:00:31 +00:00
zhanghailiang 1d2acc3162 migration: re-active images while migration been canceled after inactive them
commit fe904ea824 fixed a case
which migration aborted QEMU because it didn't regain the control
of images while some errors happened.

Actually, there are another two cases can trigger the same error reports:
" bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed",

Case 1, codes path:
migration_thread()
    migration_completion()
        bdrv_inactivate_all() ----------------> inactivate images
        qemu_savevm_state_complete_precopy()
            socket_writev_buffer() --------> error because destination fails
                qemu_fflush() ----------------> set error on migration stream
-> qmp_migrate_cancel() ----------------> user cancelled migration concurrently
    -> migrate_set_state() ------------------> set migrate CANCELLIN
    migration_completion() -----------------> go on to fail_invalidate
	if (s->state == MIGRATION_STATUS_ACTIVE) -> Jump this branch

Case 2, codes path:
migration_thread()
    migration_completion()
        bdrv_inactivate_all() ----------------> inactivate images
    migreation_completion() finished
-> qmp_migrate_cancel() ---------------> user cancelled migration concurrently
    qemu_mutex_lock_iothread();
    qemu_bh_schedule (s->cleanup_bh);

As we can see from above, qmp_migrate_cancel can slip in whenever
migration_thread does not hold the global lock. If this happens after
bdrv_inactive_all() been called, the above error reports will appear.

To prevent this, we can call bdrv_invalidate_cache_all() in qmp_migrate_cancel()
directly if we find images become inactive.

Besides, bdrv_invalidate_cache_all() in migration_completion() doesn't have the
protection of big lock, fix it by add the missing qemu_mutex_lock_iothread();

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-Id: <1485244792-11248-1-git-send-email-zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-01-24 18:00:31 +00:00
Ashijeet Acharya b67b8c3a9d migration: Fail migration blocker for --only-migratable
migrate_add_blocker should rightly fail if the '--only-migratable'
option was specified and the device in use should not be able to
perform the action which results in an unmigratable VM.

Make migrate_add_blocker return -EACCES in this case.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-6-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-01-24 18:00:31 +00:00
Ashijeet Acharya fe44dc9180 migration: disallow migrate_add_blocker during migration
If a migration is already in progress and somebody attempts
to add a migration blocker, this should rightly fail.

Add an errp parameter and a retcode return value to migrate_add_blocker.

Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Message-Id: <1484566314-3987-5-git-send-email-ashijeetacharya@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Merged with recent 'Allow invtsc migration' change
2017-01-24 18:00:30 +00:00
zhanghailiang fe39a4d440 migration: fix missing assignment for has_x_checkpoint_delay
We forgot to assign true to params->has_x_checkpoint_delay parameter
in qmp_query_migrate_parameters.

Without this, qmp command 'query-migrate-parameters' doesn't show the
default value for x-checkpoint-delay option.

This also fixes the fact that HMP was relying on unspecified behavior by
reading x_checkpoint_delay without checking has_x_checkpoint_delay.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-11-14 14:50:56 +01:00
Peter Maydell eab9e9629c Migration bits from the COLO project
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQItBAABCAAXBQJYFc37EBxhbWl0QGtlcm5lbC5vcmcACgkQ6wtN/GV+9nDdqQ/9
 H+di+q/zF6aOEuXRCN0ud9R81fZ7nLve3e9EbKCNZXnxN3M3+zQj0At+e/SBEoTc
 rRwqJmNlRX3TWViWsAYmddgtopZ9R9DWwW/VsKjS2Ng230YShrA/o20hu2dJkwl3
 CN4vAObzc/gxM59NWUlMnTXOG+Z9fI1NlEf0vZ2484a59KPVIE7W1zZccT1F8MNq
 sfa3RlOGBchNO2Rfzrr6cFGGH7UTfWiftPs7EDOfN/YBcpld6V4DfPWdPP8r2DSX
 gG7D3DJuuqPxZCjl0Nm1OjaIunYfVrRpMPMeNyo1+kTVbvvhDPFjah/MSWu8XJ2c
 N5lSqikIAWfMbQVW9gDpQ0495eRQTWA7VIlCbwN6mqNxyBvQMA+licFq6UFFrCMp
 quC3gO+daz3fKvUhi23TpebbqKLHA5OZA5ZvkjGDgkKPMCJyQRBZHLb81t5xsulZ
 cXuzAOeRcXK9aEJvcrDgWwxzi3PN8zg74RF1ZV8gxM4DkHKohlsnbgshWqGkFh3M
 +S5tEPqVlOlDO9juf6rlwNnVbWhFDFEGMKjI9XMTWwVWTREbqyP86fP66h2C4qc6
 34yAHi5G2i43dxzHgpHC0MpU0XenO0EYdu+8Tcx35LSBSfkOeD7pU1DeZQgbI51m
 ZQSnLDJqv2HVdoZT7vaIjwuhtEl84xqelFdVg+cY7Gw=
 =sur7
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.8' into staging

Migration bits from the COLO project

# gpg: Signature made Sun 30 Oct 2016 10:39:55 GMT
# gpg:                using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337  2735 1E9A 3B5F 8540 83B6
#      Subkey fingerprint: CC63 D332 AB8F 4617 4529  6534 EB0B 4DFC 657E F670

* remotes/amit-migration/tags/migration-for-2.8:
  MAINTAINERS: Add maintainer for COLO framework related files
  configure: Support enable/disable COLO feature
  docs: Add documentation for COLO feature
  COLO: Implement failover work for secondary VM
  COLO: Implement the process of failover for primary VM
  COLO: Introduce state to record failover process
  COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
  COLO: Synchronize PVM's state to SVM periodically
  COLO: Add checkpoint-delay parameter for migrate-set-parameters
  COLO: Load VMState into QIOChannelBuffer before restore it
  COLO: Send PVM state to secondary side when do checkpoint
  COLO: Add a new RunState RUN_STATE_COLO
  COLO: Introduce checkpointing protocol
  COLO: Establish a new communicating path for COLO
  migration: Switch to COLO process after finishing loadvm
  migration: Enter into COLO mode after migration if COLO is enabled
  COLO: migrate COLO related info to secondary node
  migration: Introduce capability 'x-colo' to migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 13:06:38 +00:00
Peter Maydell 277d44f5a6 trivial patches for 2016-10-28
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABCAAGBQJYE2wfAAoJEHAbT2saaT5ZGYUH/3QWJ4OFWbqGo1YYN5AIAheF
 v1bQGTh1HGbLk46ajhUvzB0bMHb1FC1KoOruU2wFYuKK/J5zQ+4X9EmaC/fD7hyx
 nGTcPWAyxKOlqOq3In9ro+xWQNzEhfoypKCQQVC4Y3quzub48wAro8fuFSNXLyBq
 ERvAsjgj0TrLEHoWtJl2bPYiqSd6KAHZAKPFW3Jw8MmsBcTLmnF2PVW3LBfdcHe7
 6vlhqX7lPzVlHRaUsaxRkFxYd2YGisbe3bPRDw2fTxrtOYyEkopQq7xi2Q6Yq5N0
 z0yM2oJ7o1QtUOXYa7KBf03WZ7e119HimaUkGLg+0LVhQNbeG3hd3gNwApXa5og=
 =tYml
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mjt/tags/trivial-patches-fetch' into staging

trivial patches for 2016-10-28

# gpg: Signature made Fri 28 Oct 2016 16:17:51 BST
# gpg:                using RSA key 0x701B4F6B1A693E59
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"
# Primary key fingerprint: 6EE1 95D1 886E 8FFB 810D  4324 457C E0A0 8044 65C5
#      Subkey fingerprint: 7B73 BAD6 8BE7 A2C2 8931  4B22 701B 4F6B 1A69 3E59

* remotes/mjt/tags/trivial-patches-fetch: (23 commits)
  Fix build for less common build directories names
  clean-up: removed duplicate #includes
  scripts/clean-includes: added duplicate #include check
  monitor: deprecate 'default' option
  qemu-ga: Remove stray 'q' in documentation
  Makefile: Fix help text for target 'installer'
  s390: avoid always-true comparison in s390_pci_generate_fid()
  migration: Remove unneeded NULL check from migrate_fd_error()
  scripts/hxtool: fix undefined behavour of echo
  qemu-options.hx: set: fix copy-paste error
  usb: Change *_exitfn return type from int to void
  MAINTAINERS: qemu-trivial information
  colo-compare: remove unused struct CompareChardevProps and 'props' variable
  milkymist-pfpu: fix potential integer overflow
  hw/block/nvme: Simplify if-statements a little bit
  target-lm32: rewrite gen_compare()
  lm32: milkymist-tmu2: fix integer overflow
  target-lm32: disable asm logging via LOG_DIS()
  target-lm32: swap operand of wcsr in LOG_DIS()
  target-lm32: fix LOG_DIS operand order
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-10-31 11:58:30 +00:00
zhanghailiang 68b5359187 COLO: Add checkpoint-delay parameter for migrate-set-parameters
Add checkpoint-delay parameter for migrate-set-parameters, so that
we can control the checkpoint frequency when COLO is in periodic mode.

Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 25d0c16f62 migration: Switch to COLO process after finishing loadvm
Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.

We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 0b827d5e72 migration: Enter into COLO mode after migration if COLO is enabled
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.

We reuse migration thread, so the process of checkpointing will be handled
in migration thread.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
zhanghailiang 35a6ed4f71 migration: Introduce capability 'x-colo' to migration
We add helper function colo_supported() to indicate whether
colo is supported or not, with which we use to control whether or not
showing 'x-colo' string to users, they can use qmp command
'query-migrate-capabilities' or hmp command 'info migrate_capabilities'
to learn if colo is supported.

The default value for COLO (COarse-Grain LOck Stepping) is disabled.

Cc: Juan Quintela <quintela@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
2016-10-30 15:17:39 +05:30
Peter Maydell 25174055f4 migration: Remove unneeded NULL check from migrate_fd_error()
All the callers of migrate_fd_error() pass a non-NULL
error parameter, and if any did pass NULL then we would
segfault in error_copy(), so remove the unnecessary
NULL check earlier in the function.
(Spotted by Coverity.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-10-28 18:17:23 +03:00
Daniel P. Berrange 6f01f136af migration: set name for all I/O channels created
Ensure that all I/O channels created for migration are given names
to distinguish their respective roles.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-10-27 09:13:10 +02:00
Ashijeet Acharya 2ff3025797 migrate: move max-bandwidth and downtime-limit to migrate_set_parameter
Mark the old commands 'migrate_set_speed' and 'migrate_set_downtime' as
deprecated.
Move max-bandwidth and downtime-limit into migrate-set-parameters for
setting maximum migration speed and expected downtime limit parameters
respectively.
Change downtime units to milliseconds (only for new-command) and set
its upper bound limit to 2000 seconds.
Update the query part in both hmp and qmp qemu control interfaces.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Ashijeet Acharya 091ecc8b69 migrate: Fix bounds check for migration parameters in migration.c
This patch fixes the out-of-bounds check of migration parameters in
qmp_migrate_set_parameters() for cpu-throttle-initial and
cpu-throttle-increment by adding a return statement for both as they
were broken since their introduction in 2.5 via commit 1626fee.
Due to the missing return statements, parameters were getting set to
out-of-bounds values despite the error.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Eric Blake 7f375e0446 migrate: Use boxed qapi for migrate-set-parameters
Now that QAPI makes it easy to pass a struct around, we don't
have to declare as many parameters or local variables.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Eric Blake de63ab6124 migrate: Share common MigrationParameters struct
It is rather verbose, and slightly error-prone, to repeat
the same set of parameters for input (migrate-set-parameters)
as for output (query-migrate-parameters), where the only
difference is whether the members are optional.  We can just
document that the optional members will always be present
on output, and then share a common struct between both
commands.  The next patch can then reduce the amount of
code needed on input.

Also, we made a mistake in qemu 2.7 of returning an empty
string during 'query-migrate-parameters' when there is no
TLS, rather than omitting TLS details entirely.  Technically,
this change risks breaking any 2.7 client that is hard-coded
to expect the parameter's existence; on the other hand, clients
that are portable to 2.6 already must be prepared for those
members to not be present.

And this gets rid of yet one more place where the QMP output
visitor is silently converting a NULL string into "" (which
is a hack I ultimately want to kill off).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2016-10-13 17:23:53 +02:00
Dr. David Alan Gilbert 42da5550d6 migration: set state to post-migrate on failure
If a migration fails/is cancelled during the postcopy stage we currently
end up with the runstate as finish-migrate, where it should be post-migrate.
There's a small window in precopy where I think the same thing can
happen, but I've never seen it.

It rarely matters; the only postcopy case is if you restart a migration, which
again is a case that rarely matters in postcopy because it's only
safe to restart the migration if you know the destination hasn't
been running (which you might if you started the destination with -S
and hadn't got around to 'c' ing it before the postcopy failed).
Even then it's a small window but potentially you could hit if
there's a problem loading the devices on the destination.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1355683

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1468601086-32117-1-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-07-22 13:23:09 +05:30
Paolo Bonzini 0b8b8753e4 coroutine: move entry argument to qemu_coroutine_create
In practice the entry argument is always known at creation time, and
it is confusing that sometimes qemu_coroutine_enter is used with a
non-NULL argument to re-enter a coroutine (this happens in
block/sheepdog.c and tests/test-coroutine.c).  So pass the opaque value
at creation time, for consistency with e.g. aio_bh_new.

Mostly done with the following semantic patch:

@ entry1 @
expression entry, arg, co;
@@
- co = qemu_coroutine_create(entry);
+ co = qemu_coroutine_create(entry, arg);
  ...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);

@ entry2 @
expression entry, arg;
identifier co;
@@
- Coroutine *co = qemu_coroutine_create(entry);
+ Coroutine *co = qemu_coroutine_create(entry, arg);
  ...
- qemu_coroutine_enter(co, arg);
+ qemu_coroutine_enter(co);

@ entry3 @
expression entry, arg;
@@
- qemu_coroutine_enter(qemu_coroutine_create(entry), arg);
+ qemu_coroutine_enter(qemu_coroutine_create(entry, arg));

@ reentry @
expression co;
@@
- qemu_coroutine_enter(co, NULL);
+ qemu_coroutine_enter(co);

except for the aforementioned few places where the semantic patch
stumbled (as expected) and for test_co_queue, which would otherwise
produce an uninitialized variable warning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-07-13 13:26:02 +02:00
Peter Maydell 4d88513157 migration: Don't use *_to_cpup() and cpu_to_*w()
The *_to_cpup() and cpu_to_*w() functions just compose a pointer
dereference with a byteswap. Instead use ld*_p() and st*_p(),
which handle potential pointer misalignment and avoid the need
to cast the pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1465574962-2710-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:23:49 +05:30
Daniel P. Berrange 22724f4921 migration: rename functions to starting migrations
Apply the following renames for starting incoming migration:

 process_incoming_migration -> migration_fd_process_incoming
 migration_set_incoming_channel -> migration_channel_process_incoming
 migration_tls_set_incoming_channel -> migration_tls_channel_process_incoming

and for starting outgoing migration:

 migration_set_outgoing_channel -> migration_channel_connect
 migration_tls_set_outgoing_channel -> migration_tls_channel_connect

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1464776234-9910-3-git-send-email-berrange@redhat.com
Message-Id: <1464776234-9910-3-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:51:37 +05:30
Dr. David Alan Gilbert 096631bd95 Postcopy: Check for support when setting the capability
Knowing whether the destination host supports migration with
postcopy can be tricky.
The destination doesn't need the capability set, however
if we set it then use the opportunity to do the test and
tell the user/management layer early.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1465816605-29488-7-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-7-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Dr. David Alan Gilbert d3bf5418e2 Postcopy: Add stats on page requests
On the source, add a count of page requests received from the
destination.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-id: 1465816605-29488-4-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-4-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Dr. David Alan Gilbert a22463a5dc Migration: Split out ram part of qmp_query_migrate
The RAM section of qmp_query_migrate is reasonably complex
and repeated 3 times.  Split it out into a helper.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1465816605-29488-3-git-send-email-dgilbert@redhat.com
Reviwed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <1465816605-29488-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Daniel P. Berrange e122636562 migration: add support for encrypting data with TLS
This extends the migration_set_incoming_channel and
migration_set_outgoing_channel methods so that they
will automatically wrap the QIOChannel in a
QIOChannelTLS instance if TLS credentials are configured
in the migration parameters.

This allows TLS to work for tcp, unix, fd and exec
migration protocols. It does not (currently) work for
RDMA since it does not use these APIs, but it is
unlikely that TLS would be desired with RDMA anyway
since it would degrade the performance to that seen
with TCP defeating the purpose of using RDMA.

On the target host, QEMU would be launched with a set
of TLS credentials for a server endpoint

 $ qemu-system-x86_64 -monitor stdio -incoming defer \
    -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=server,id=tls0 \
    ...other args...

To enable incoming TLS migration 2 monitor commands are
then used

  (qemu) migrate_set_str_parameter tls-creds tls0
  (qemu) migrate_incoming tcp:myhostname:9000

On the source host, QEMU is launched in a similar
manner but using client endpoint credentials

 $ qemu-system-x86_64 -monitor stdio \
    -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=client,id=tls0 \
    ...other args...

To enable outgoing TLS migration 2 monitor commands are
then used

  (qemu) migrate_set_str_parameter tls-creds tls0
  (qemu) migrate tcp:otherhostname:9000

Thanks to earlier improvements to error reporting,
TLS errors can be seen 'info migrate' when doing a
detached migration. For example:

  (qemu) info migrate
  capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
  Migration status: failed
  total time: 0 milliseconds
  error description: TLS handshake failed: The TLS connection was non-properly terminated.

Or

  (qemu) info migrate
  capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
  Migration status: failed
  total time: 0 milliseconds
  error description: Certificate does not match the hostname localhost

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-27-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:13 +05:30
Daniel P. Berrange 69ef1f36b0 migration: define 'tls-creds' and 'tls-hostname' migration parameters
Define two new migration parameters to be used with TLS encryption.
The 'tls-creds' parameter provides the ID of an instance of the
'tls-creds' object type, or rather a subclass such as 'tls-creds-x509'.
Providing these credentials will enable use of TLS on the migration
data stream.

If using x509 certificates, together with a migration URI that does
not include a hostname, the 'tls-hostname' parameter provides the
hostname to use when verifying the server's x509 certificate. This
allows TLS to be used in combination with fd: and exec: protocols
where a TCP connection is established by a 3rd party outside of
QEMU.

NB, this requires changing the migrate_set_parameter method in the
HMP to accept a 's' (string) value instead of 'i' (integer). This
is backwards compatible, because the parsing of strings allows the
quotes to be optional, thus any integer is also a valid string.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-26-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:10 +05:30
Daniel P. Berrange 2594f56d4c migration: don't use an array for storing migrate parameters
The MigrateState struct uses an array for storing migration
parameters. This presumes that all future parameters will
be integers too, which is not going to be the case. There
is no functional reason why an array is used, if anything
it makes the code less clear. The QAPI schema already
defines a struct - MigrationParameters - capable of storing
all the individual parameters, so just use that instead of
an array.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-25-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:07 +05:30
Daniel P. Berrange 527792fae6 migration: convert exec socket protocol to use QIOChannel
Convert the exec socket migration protocol driver to use
QIOChannel and QEMUFileChannel, instead of the stdio
popen APIs. It can be unconditionally built because the
QIOChannelCommand class can report suitable error messages
on platforms which can't fork processes.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-17-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:47 +05:30
Daniel P. Berrange 64802ee57f migration: convert fd socket protocol to use QIOChannel
Convert the fd socket migration protocol driver to use
QIOChannel and QEMUFileChannel, instead of plain sockets
APIs. It can be unconditionally built because the
QIOChannel APIs it uses will take care to report suitable
error messages if needed.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-16-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:45 +05:30
Daniel P. Berrange d984464eb9 migration: convert unix socket protocol to use QIOChannel
Convert the unix socket migration protocol driver to use
QIOChannel and QEMUFileChannel, instead of plain sockets
APIs. It can be unconditionally built, since the socket
impl of QIOChannel will report a suitable error on platforms
where UNIX sockets are unavailable.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-13-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:37 +05:30
Daniel P. Berrange 61b67d473d migration: convert post-copy to use QIOChannelBuffer
The post-copy code does some I/O to/from an intermediate
in-memory buffer rather than direct to the underlying
I/O channel. Switch this code to use QIOChannelBuffer
instead of QEMUSizedBuffer.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-12-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:34 +05:30
Daniel P. Berrange d59ce6f344 migration: add reporting of errors for outgoing migration
Currently if an application initiates an outgoing migration,
it may or may not, get an error reported back on failure. If
the error occurs synchronously to the 'migrate' command
execution, the client app will see the error message. This
is the case for DNS lookup failures. If the error occurs
asynchronously to the monitor command though, the error
will be thrown away and the client left guessing about
what went wrong. This is the case for failure to connect
to the TCP server (eg due to wrong port, or firewall
rules, or other similar errors).

In the future we'll be adding more scope for errors to
happen asynchronously with the TLS protocol handshake.
TLS errors are hard to diagnose even when they are well
reported, so discarding errors entirely will make it
impossible to debug TLS connection problems.

Management apps which do migration are already using
'query-migrate' / 'info migrate' to check up on progress
of background migration operations and to see their end
status. This is a fine place to also include the error
message when things go wrong.

This patch thus adds an 'error-desc' field to the
MigrationInfo struct, which will be populated when
the 'status' is set to 'failed':

(qemu) migrate -d tcp:localhost:9001
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
Migration status: failed (Error connecting to socket: Connection refused)
total time: 0 milliseconds

In the HMP, when doing non-detached migration, it is
also possible to display this error message directly
to the app.

(qemu) migrate tcp:localhost:9001
Error connecting to socket: Connection refused

Or with QMP

  {
    "execute": "query-migrate",
    "arguments": {}
  }
  {
    "return": {
      "status": "failed",
      "error-desc": "address resolution failed for myhost:9000: No address associated with hostname"
    }
  }

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-11-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:30 +05:30
Daniel P. Berrange 48f07489ed migration: add helpers for creating QEMUFile from a QIOChannel
Currently creating a QEMUFile instance from a QIOChannel is
quite simple only requiring a single call to
qemu_fopen_channel_input or  qemu_fopen_channel_output
depending on the end of migration connection.

When QEMU gains TLS support, however, there will need to be
a TLS negotiation done inbetween creation of the QIOChannel
and creation of the final QEMUFile. Introduce some helper
methods that will encapsulate this logic, isolating the
migration protocol drivers from knowledge about TLS.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-10-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:27 +05:30
Daniel P. Berrange 9e4d2b98ee migration: force QEMUFile to blocking mode for outgoing migration
Instead of relying on the default QEMUFile I/O blocking flag
state, explicitly turn on blocking I/O for outgoing migration
since it takes place in a background thread.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-8-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:21 +05:30
Daniel P. Berrange 06ad513532 migration: introduce set_blocking function in QEMUFileOps
Remove the assumption that every QEMUFile implementation has
a file descriptor available by introducing a new function
in QEMUFileOps to change the blocking state of a QEMUFile.

If not set, it will fallback to the original code using
the get_fd method.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:19 +05:30
Greg Kurz fe904ea824 migration: regain control of images when migration fails to complete
We currently have an error path during migration that can cause
the source QEMU to abort:

migration_thread()
  migration_completion()
    runstate_is_running() ----------------> true if guest is running
    bdrv_inactivate_all() ----------------> inactivate images
    qemu_savevm_state_complete_precopy()
     ... qemu_fflush()
           socket_writev_buffer() --------> error because destination fails
         qemu_fflush() -------------------> set error on migration stream
  migration_completion() -----------------> set migrate state to FAILED
migration_thread() -----------------------> break migration loop
  vm_start() -----------------------------> restart guest with inactive
                                            images

and you get:

qemu-system-ppc64: socket_writev_buffer: Got err=104 for (32768/18446744073709551615)
qemu-system-ppc64: /home/greg/Work/qemu/qemu-master/block/io.c:1342:bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
Aborted (core dumped)

If we try postcopy with a similar scenario, we also get the writev error
message but QEMU leaves the guest paused because entered_postcopy is true.

We could possibly do the same with precopy and leave the guest paused.
But since the historical default for migration errors is to restart the
source, this patch adds a call to bdrv_invalidate_cache_all() instead.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <146357896785.6003.11983081732454362715.stgit@bahia.huguette.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 22:19:36 +05:30
Greg Kurz 24f3902b08 savevm: fail if migration blockers are present
QEMU has currently two ways to prevent migration to occur:
- migration blocker when it depends on runtime state
- VMStateDescription.unmigratable when migration is not supported at all

This patch gathers all the logic into a single function to be called from
both the savevm and the migrate paths.

This fixes a bug with 9p, at least, where savevm would succeed and the
following would happen in the guest after loadvm:

$ ls /host
ls: cannot access /host: Protocol error

With this patch:

(qemu) savevm foo
Migration is disabled when VirtFS export path '/' is mounted in the guest
using mount_tag 'host'

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <146239057139.11271.9011797645454781543.stgit@bahia.huguette.org>

[Update subject according to Paolo's suggestion - Amit]

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 21:44:08 +05:30
Jason J. Herne d85a31d1f4 migration: Promote improved autoconverge commands out of experimental state
The new autoconverge throttling commands have been tested for a release now. It
is time to move them out of the experimental state.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Message-Id: <1461262038-8197-1-git-send-email-jjherne@linux.vnet.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 16:05:09 +05:30
Stefan Weil cb8d4c8f54 Fix some typos found by codespell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00