Commit Graph

84 Commits

Author SHA1 Message Date
Denis V. Lunev 0aa6aefc9c migration (ordinary): move bdrv_invalidate_cache_all of of coroutine context
There is a possibility to hit an assert in qcow2_get_specific_info that
s->qcow_version is undefined. This happens when VM in starting from
suspended state, i.e. it processes incoming migration, and in the same
time 'info block' is called.

The problem is that qcow2_invalidate_cache() closes the image and
memset()s BDRVQcowState in the middle.

The patch moves processing of bdrv_invalidate_cache_all out of
coroutine context for standard migration to avoid that.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456304019-10507-2-git-send-email-den@openvz.org>

[Amit: Fix a use-after-free bug]

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 20:39:50 +05:30
Dr. David Alan Gilbert b82fc321bf Postcopy+spice: Pass spice migration data earlier
Spice hooks the migration status changes to figure out when to
transmit information to the new spice server; but the migration
status in postcopy doesn't quite fit - the destination starts
running before the end of the source migration.

It's not a case of hanging off the migration status change to
postcopy-active either, since that happens before we stop the
guest CPU.

Fix it by sending a notify just after sending the device state,
and adding a flag that can be tested by the notify receiver.

Symptom:
   spice handover doesn't work with the error:
   red_worker.c:11540:display_channel_wait_for_migrate_data: timeout

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-id: 1456161452-25318-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 12:05:02 +01:00
Wei Yang 5b648de0ee rdma: remove check on time_spent when calculating mbs
Within the if statement, time_spent is assured to be non-zero.

This patch just removes the check on time_spent when calculating mbs.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Liang Li b33dc45c3f migration: remove useless code.
Since 's->state' will be set in migrate_init(), there is no
need to set it before calling migrate_init(). The code and
the related comments can be removed.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1453875065-24326-1-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang 89a02a9f7b migration: rename 'file' in MigrationState to 'to_dst_file'
Rename the 'file' member of MigrationState to 'to_dst_file' to
be consistent with to_src_file, from_src_file and from_dst_file.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-3-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
Peter Maydell 1393a48526 migration: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-2-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:22 +00:00
Kevin Wolf 76b1c7fe1c block: Inactivate BDS when migration completes
So far, live migration with shared storage meant that the image is in a
not-really-ready don't-touch-me state on the destination while the
source is still actively using it, but after completing the migration,
the image was fully opened on both sides. This is bad.

This patch adds a block driver callback to inactivate images on the
source before completing the migration. Inactivation means that it goes
to a state as if it was just live migrated to the qemu instance on the
source (i.e. BDRV_O_INACTIVE is set). You're then supposed to continue
either on the source or on the destination, which takes ownership of the
image.

A typical migration looks like this now with respect to disk images:

1. Destination qemu is started, the image is opened with
   BDRV_O_INACTIVE. The image is fully opened on the source.

2. Migration is about to complete. The source flushes the image and
   inactivates it. Now both sides have the image opened with
   BDRV_O_INACTIVE and are expecting the other side to still modify it.

3. One side (the destination on success) continues and calls
   bdrv_invalidate_all() in order to take ownership of the image again.
   This removes BDRV_O_INACTIVE on the resuming side; the flag remains
   set on the other side.

This ensures that the same image isn't written to by both instances
(unless both are resumed, but then you get what you deserve). This is
important because .bdrv_close for non-BDRV_O_INACTIVE images could write
to the image file, which is definitely forbidden while another host is
using the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2016-01-20 13:36:23 +01:00
zhanghailiang 93d7af6ff0 migration: Add state records for migration incoming
For migration destination, we also need to know its state,
we will use it in COLO.

Here we add a new member 'state' for MigrationIncomingState,
and also use migrate_set_state() to modify its value.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

dgilbert: Fixed early free of MigraitonIncomingState
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1450266458-3178-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-01-13 16:01:24 +05:30
zhanghailiang 48781e5bf2 migration: Export migrate_set_state()
Change the first parameter of migrate_set_state(), and export it.
We will use it in a later patch to update incoming state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

dgilbert: Updated comment as per Juan's review
Message-Id: <1450266458-3178-2-git-send-email-dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-01-13 16:00:39 +05:30
Eric Blake 7fb1cf1606 qapi: Don't let implicit enum MAX member collide
Now that we guarantee the user doesn't have any enum values
beginning with a single underscore, we can use that for our
own purposes.  Renaming ENUM_MAX to ENUM__MAX makes it obvious
that the sentinel is generated.

This patch was mostly generated by applying a temporary patch:

|diff --git a/scripts/qapi.py b/scripts/qapi.py
|index e6d014b..b862ec9 100644
|--- a/scripts/qapi.py
|+++ b/scripts/qapi.py
|@@ -1570,6 +1570,7 @@ const char *const %(c_name)s_lookup[] = {
|     max_index = c_enum_const(name, 'MAX', prefix)
|     ret += mcgen('''
|     [%(max_index)s] = NULL,
|+// %(max_index)s
| };
| ''',
|                max_index=max_index)

then running:

$ cat qapi-{types,event}.c tests/test-qapi-types.c |
    sed -n 's,^// \(.*\)MAX,s|\1MAX|\1_MAX|g,p' > list
$ git grep -l _MAX | xargs sed -i -f list

The only things not generated are the changes in scripts/qapi.py.

Rejecting enum members named 'MAX' is now useless, and will be dropped
in the next patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447836791-369-23-git-send-email-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
[Rebased to current master, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-12-17 08:21:28 +01:00
Paolo Bonzini a694ee343d migration: do floating-point division
Dividing integer expressions transferred_bytes and time_spent, and then converting
the integer quotient to type double. Any remainder, or fractional part of the
quotient, is ignored.  Fix this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-12-03 00:03:00 +01:00
Dr. David Alan Gilbert 5df5416e63 Unneeded NULL check
The check is unneccesary, we read the value at the start of the
thread, use it, and never change it.  The value is checked to be
non-NULL before thread creation.

Spotted by coverity, CID 1339211

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-19 11:49:53 +01:00
Dr. David Alan Gilbert 95a7788b2f migration: Dead assignment of current_time
I set current_time before the postcopy test but never use it;
(I think this was from the original version where it was time based).
Spotted by coverity, CID 1339208

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-19 11:49:53 +01:00
Dr. David Alan Gilbert 389775d1f6 migration_init: Fix lock initialisation/make it explicit
Peter reported a lock error on MacOS after my a82d593b
patch.

migrate_get_current does one-time initialisation of
a bunch of variables.
migrate_init does reinitialisation even on a 2nd
migrate after a cancel.

The problem here was that I'd initialised the mutex
in migrate_get_current, and the memset in migrate_init
corrupted it.

Remove the memset and replace it by explicit initialisation
of fields that need initialising; this also turns out to be simpler
than the old code that had to preserve some fields.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: a82d593b
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12 17:55:27 +01:00
Dr. David Alan Gilbert a54d340b9d migrate-start-postcopy: Improve text
Improve the text in both the qapi-schema and hmp help to point out
you need to set the postcopy-ram capability prior to issuing
migrate-start-postcopy.

Also fix the text of the migrate_start_postcopy error that
deals with capabilities.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Acked-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12 17:54:39 +01:00
Dr. David Alan Gilbert 1c0d249ddf Finish non-postcopiable iterative devices before package
Where we have iterable, but non-postcopiable devices (e.g. htab
or block migration), complete them before forming the 'package'
but with the CPUs stopped.  This stops them filling up the package.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-12 17:52:29 +01:00
Dr. David Alan Gilbert e9bef235d9 End of migration for postcopy
Tweak the end of migration cleanup; we don't want to close stuff down
at the end of the main stream, since the postcopy is still sending pages
on the other thread.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:28 +01:00
Dr. David Alan Gilbert c76201ab52 Start up a postcopy/listener thread ready for incoming page data
The loading of a device state (during postcopy) may access guest
memory that's still on the source machine and thus might need
a page fill; split off a separate thread that handles the incoming
page data so that the original incoming migration code can finish
off the device data.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:28 +01:00
Dr. David Alan Gilbert 35ecd943e7 Don't iterate on precopy-only devices during postcopy
During the postcopy phase we must not call the iterate method on
precopy-only devices, since they may have done some cleanup during
the _complete call at the end of the precopy phase.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:28 +01:00
Dr. David Alan Gilbert 6c595cdee1 Page request: Process incoming page request
On receiving MIG_RPCOMM_REQ_PAGES look up the address and
queue the page.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:27 +01:00
Dr. David Alan Gilbert 1e2d90ebc5 Page request: Add MIG_RP_MSG_REQ_PAGES reverse command
Add MIG_RP_MSG_REQ_PAGES command on Return path for the postcopy
destination to request a page from the source.

Two versions exist:
   MIG_RP_MSG_REQ_PAGES_ID that includes a RAMBlock name and start/len
   MIG_RP_MSG_REQ_PAGES that just has start/len for use with the same
                        RAMBlock as a previous MIG_RP_MSG_REQ_PAGES_ID

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:27 +01:00
Dr. David Alan Gilbert b10ac0c42c Postcopy: End of iteration
The end of migration in postcopy is a bit different since some of
the things normally done at the end of migration have already been
done on the transition to postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:27 +01:00
Dr. David Alan Gilbert 1d34e4bf6a Postcopy: Postcopy startup in migration thread
Rework the migration thread to setup and start postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:27 +01:00
Dr. David Alan Gilbert e0b266f01d migration_completion: Take current state
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we
complete migration, and we need to know which we expect to be
in to change state safely.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:27 +01:00
Dr. David Alan Gilbert 9ec055ae29 MIGRATION_STATUS_POSTCOPY_ACTIVE: Add new migration state
'MIGRATION_STATUS_POSTCOPY_ACTIVE' is entered after migrate_start_postcopy

'migration_in_postcopy' is provided for other sections to know if
they're in postcopy.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 36f48567b8 migration_completion: Take current state
Soon we'll be in either ACTIVE or POSTCOPY_ACTIVE when we
complete migration, and we need to know which we expect to be
in to change state safely.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 4886a1bcb7 migrate_start_postcopy: Command to trigger transition to postcopy
Once postcopy is enabled (with migrate_set_capability), the migration
will still start on precopy mode.  To cause a transition into postcopy
the:

  migrate_start_postcopy

command must be issued.  Postcopy will start sometime after this
(when it's next checked in the migration loop).

Issuing the command before migration has started will error,
and issuing after it has finished is ignored.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert c31b098f64 Modify save_live_pending for postcopy
Modify save_live_pending to return separate postcopiable and
non-postcopiable counts.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 093e3c4296 Add wrappers and handlers for sending/receiving the postcopy-ram migration messages.
The state of the postcopy process is managed via a series of messages;
   * Add wrappers and handlers for sending/receiving these messages
   * Add state variable that track the current state of postcopy

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 53dd370ced Add migration-capability boolean for postcopy-ram.
The 'postcopy ram' capability allows postcopy migration of RAM;
note that the migration starts off in precopy mode until
postcopy mode is triggered (see the migrate_start_postcopy
patch later in the series).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 7b89bf279f Rework loadvm path for subloops
Postcopy needs to have two migration streams loading concurrently;
one from memory (with the device state) and the other from the fd
with the memory transactions.

Split the core of qemu_loadvm_state out so we can use it for both.

Allow the inner loadvm loop to quit and cause the parent loops to
exit as well.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 70b2047774 Return path: Source handling of return path
Open a return path, and handle messages that are received upon it.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert f6844b99ce migration_is_setup_or_active
Add 'migration_is_setup_or_active' utility function to check state.
(It gets postcopy added to it's list later on in the series)

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert 6decec9311 Return path: Send responses from destination to source
Add migrate_send_rp_message to send a message from destination to source along the return path.
  (It uses a mutex to let it be called from multiple threads)
Add migrate_send_rp_shut to send a 'shut' message to indicate
  the destination is finished with the RP.
Add migrate_send_rp_ack to send a 'PONG' message in response to a PING
  Use it in the MSG_RP_PING handler

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 15:00:26 +01:00
Dr. David Alan Gilbert a3e06c3d13 Rename save_live_complete to save_live_complete_precopy
In postcopy we're going to need to perform the complete phase
for postcopiable devices at a different point, start out by
renaming all of the 'complete's to make the difference obvious.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:49 +01:00
Dr. David Alan Gilbert aefeb18bde migrate_init: Call from savevm
Suspend to file is very much like a migrate, and it makes life
easier if we have the Migration state available, so initialise it
in the savevm.c code for suspending.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewd-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:49 +01:00
Dr. David Alan Gilbert 42e2aa5637 Rename mis->file to from_src_file
'file' becomes confusing when you have flows in each direction;
rename to make it clear.
This leaves just the main forward direction ms->file, which is used
in a lot of places and is probably not worth renaming given the churn.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-11-10 14:51:48 +01:00
Liang Li ea7415fac6 migration: rename qemu_savevm_state_cancel
The function qemu_savevm_state_cancel is called after the migration
in migration_thread, it seems strange to 'cancel' it after completion,
rename it to qemu_savevm_state_cleanup looks better.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
2015-11-04 13:40:13 +01:00
Liang Li 94f5a43704 migration: defer migration_end & blk_mig_cleanup
Because of the patch 3ea3b7fa9af067982f34b of kvm, which introduces a
lazy collapsing of small sptes into large sptes mechanism, now
migration_end() is a time consuming operation because it calls
memroy_global_dirty_log_stop(), which will trigger the dropping of small
sptes operation and takes about dozens of milliseconds, so call
migration_end() before all the vmsate data has already been transferred
to the destination will prolong VM downtime. This operation should be
deferred after all the data has been transferred to the destination.

blk_mig_cleanup() can be deferred too.

For a VM with 8G RAM, this patch can reduce the VM downtime about 30 ms.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>al3
Reviewed-by: Amit Shah <amit.shah@redhat.com>al3
Signed-off-by: Juan Quintela <quintela@redhat.com>al3
2015-11-04 13:40:13 +01:00
Amit Shah 92e3762237 migration: announce VM's new home just before VM is runnable
We were announcing the dest host's IP as our new IP a bit too soon -- if
there were errors detected after this announcement was done, the
migration is failed and the VM could continue running on the src host --
causing problems later.

Move around the qemu_announce_self() call so it's done just before the
VM is runnable.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2015-10-15 08:13:03 +02:00
Dr. David Alan Gilbert ed1f3e0090 Migration: Generate the completed event only when we complete
The current migration-completed event is generated a bit too early,
which means that an eager libvirt that's ready to go as soon
as it sees the event ends up racing with the actual end of migration.

This corresponds to RH bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1271145

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
xSigned-off-by: Juan Quintela <quintela@redhat.com>
2015-10-15 08:12:02 +02:00
Jason J. Herne dc3256272c migration: Disambiguate MAX_THROTTLE
Migration has a define for MAX_THROTTLE. Update comment to clarify that this is
used for throttling transfer speed. Hopefully this will prevent it from being
confused with a guest cpu throttling entity.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30 09:42:04 +02:00
Jason J. Herne 4782893e09 qmp/hmp: Add throttle ratio to query-migrate and info migrate
Report throttle percentage in info migrate and query-migrate responses when
cpu throttling is active.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30 09:42:04 +02:00
Jason J. Herne 070afca258 migration: Dynamic cpu throttling for auto-converge
Remove traditional auto-converge static 30ms throttling code and replace it
with a dynamic throttling algorithm.

Additionally, be more aggressive when deciding when to start throttling.
Previously we waited until four unproductive memory passes. Now we begin
throttling after only two unproductive memory passes. Four seemed quite
arbitrary and only waiting for two passes allows us to complete the migration
faster.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30 09:42:04 +02:00
Jason J. Herne 1626fee3bd migration: Parameters for auto-converge cpu throttling
Add migration parameters to allow the user to adjust the parameters
that control cpu throttling when auto-converge is in effect. The added
parameters are as follows:

x-cpu-throttle-initial : Initial percantage of time guest cpus are throttled
when migration auto-converge is activated.

x-cpu-throttle-increment: throttle percantage increase each time
auto-converge detects that migration is not making progress.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
2015-09-30 09:42:04 +02:00
Markus Armbruster 97f3ad3551 migration: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

This commit only touches allocations with size arguments of the form
sizeof(T).  Same Coccinelle semantic patch as in commit b45c03f.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1442231491-23352-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29 11:36:35 +05:30
Dr. David Alan Gilbert 09f6c85e39 Split out end of migration code from migration_thread
The code that gets run at the end of the migration process
is getting large, and I'm about to add more for postcopy.
Split it into a separate function.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1439463094-5394-3-git-send-email-dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2015-09-29 11:33:02 +05:30
Anthony PERARD c69adea462 migration: Fix global state with Xen.
When doing migration via the QMP command xen_save_devices_state, the
current runstate is not store into the global state section. Also the
current runstate is not the one we want on the receiver side.

During migration, the Xen toolstack paused QEMU before save the devices
state. Also, the toolstack expect QEMU to autostart when the migration is
finished.
So this patch store "running" as it's current runstate.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2015-08-03 16:13:23 +00:00
Paolo Bonzini ab28bd2312 rcu: actually register threads that have RCU read-side critical sections
Otherwise, grace periods are detected too early!

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-07-24 13:57:45 +02:00
Juan Quintela 560d027b54 migration: We also want to store the global state for savevm
Commit df4b102452 introduced global_state
section.  But it only filled the state while doing migration.  While
doing a savevm, we stored an empty string as state.  So when we did a
loadvm, it complained that state was invalid.

Fedora 21, 4.1.1, qemu 2.4.0-rc0
> ../../configure --target-list="x86_64-softmmu"

068 2s ... - output mismatch (see 068.out.bad)
--- /home/bos/jhuston/src/qemu/tests/qemu-iotests/068.out	2015-07-08
17:56:18.588164979 -0400
+++ 068.out.bad	2015-07-09 17:39:58.636651317 -0400
@@ -6,6 +6,8 @@
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) savevm 0
 (qemu) quit
+qemu-system-x86_64: Unknown savevm section or instance 'globalstate' 0
+qemu-system-x86_64: Error -22 while loading VM state
 QEMU X.Y.Z monitor - type 'help' for more information
 (qemu) quit
 *** done
Failures: 068
Failed 1 of 1 tests

Actually, there were two problems here:
- we registered global_state too late for load_vm (fixed on another
  patch on the list)
- we didn't store a valid state for savevm (fixed by this patch).

Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Tested-by:  Christian Borntraeger <borntraeger@de.ibm.com>
2015-07-15 12:22:54 +02:00