Commit Graph

87 Commits

Author SHA1 Message Date
chai wen
1ac362cdbd block: fix wrong order in live block migration setup
The function init_blk_migration is better to be called before
set_dirty_tracking as the reasons below.

If we want to track dirty blocks via dirty_maps on a BlockDriverState
when doing live block-migration, its correspoding 'BlkMigDevState' should be
added to block_mig_state.bmds_list first for subsequent processing.
Otherwise set_dirty_tracking will do nothing on an empty list than allocating
dirty_bitmaps for them. And bdrv_get_dirty_count will access the
bmds->dirty_maps directly, then there would be a segfault triggered.

If the set_dirty_tracking fails, qemu_savevm_state_cancel will handle
the cleanup of init_blk_migration automatically.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-06-04 11:22:39 +02:00
Fam Zheng
3718d8ab65 block: Replace in_use with operation blocker
This drops BlockDriverState.in_use with op_blockers:

  - Call bdrv_op_block_all in place of bdrv_set_in_use(bs, 1).

  - Call bdrv_op_unblock_all in place of bdrv_set_in_use(bs, 0).

  - Check bdrv_op_is_blocked() in place of bdrv_in_use(bs).

    The specific types are used, e.g. in place of starting block backup,
    bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP, ...).

    There is one exception in block_job_create, where
    bdrv_op_blocker_is_empty() is used, because we don't know the operation
    type here. This doesn't matter because in a few commits away we will drop
    the check and move it to callers that _do_ know the type.

  - Check bdrv_op_blocker_is_empty() in place of assert(!bs->in_use).

Note: there is only bdrv_op_block_all and bdrv_op_unblock_all callers at
this moment. So although the checks are specific to op types, this
changes can still be seen as identical logic with previously with
in_use. The difference is error message are improved because of blocker
error info.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2014-05-28 14:28:46 +02:00
Fam Zheng
b8afb520e4 block: Handle error of bdrv_getlength in bdrv_create_dirty_bitmap
bdrv_getlength could fail, check the return value before using it.
Return NULL and set errno if it fails. Callers are updated to handle
the error case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2014-04-22 11:57:02 +02:00
Fam Zheng
e4654d2d94 block: per caller dirty bitmap
Previously a BlockDriverState has only one dirty bitmap, so only one
caller (e.g. a block job) can keep track of writing. This changes the
dirty bitmap to a list and creates a BdrvDirtyBitmap for each caller, the
lifecycle is managed with these new functions:

    bdrv_create_dirty_bitmap
    bdrv_release_dirty_bitmap

Where BdrvDirtyBitmap is a linked list wrapper structure of HBitmap.

In place of bdrv_set_dirty_tracking, a BdrvDirtyBitmap pointer argument
is added to these functions, since each caller has its own dirty bitmap:

    bdrv_get_dirty
    bdrv_dirty_iter_init
    bdrv_get_dirty_count

bdrv_set_dirty and bdrv_reset_dirty prototypes are unchanged but will
internally walk the list of all dirty bitmaps and set them one by one.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-11-29 13:40:33 +01:00
Peter Lieven
d32f35cbc5 block: introduce BDRV_REQ_MAY_UNMAP request flag
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Peter Lieven
aa7bfbfff7 block: add flags to bdrv_*_write_zeroes
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-11-28 10:30:51 +01:00
Fam Zheng
8442cfd034 migration: omit drive ref as we have bdrv_ref now
block-migration.c does not actually use DriveInfo anywhere.  Hence it's
safe to drive ref code, we really only care about referencing BDS.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-09-06 15:25:08 +02:00
Peter Lieven
323004a39d block-migration: efficiently encode zero blocks
this patch adds a efficient encoding for zero blocks by
adding a new flag indicating a block is completely zero.

additionally bdrv_write_zeros() is used at the destination
to efficiently write these zeroes. depending on the implementation
this avoids that the destination target gets fully provisioned.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-07-19 12:29:21 +08:00
Paolo Bonzini
9b09503752 migration: run setup callbacks out of big lock
Only the migration_bitmap_sync() call needs the iothread lock.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
32c835ba39 migration: run pending/iterate callbacks out of big lock
This makes it possible to do blocking writes directly to the socket,
with no buffer in the middle.  For RAM, only the migration_bitmap_sync()
call needs the iothread lock.  For block migration, it is needed by
the block layer (including bdrv_drain_all and dirty bitmap access),
but because some code is shared between iterate and complete, all of
mig_save_device_dirty is run with the lock taken.

In the savevm case, the iterate callback runs within the big lock.
This is annoying because it complicates the rules.  Luckily we do not
need to do anything about it: the RAM iterate callback does not need
the iothread lock, and block migration never runs during savevm.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
52e850dea9 block-migration: add lock
Some state is shared between the block migration code and its AIO
callbacks.  Once block migration will run outside the iothread,
the block migration code and the AIO callbacks will be able to
run concurrently.  Protect the critical sections with a separate
lock.  Do the same for completed_sectors, which can be used from
the monitor.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
323920c4ea block-migration: document usage of state across threads
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
13197e3cba block-migration: small preparatory changes for locking
Some small changes that will simplify the positioning of lock/unlock
primitives.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
a55ce1c851 block-migration: remove variables that are never read
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:01 +01:00
Paolo Bonzini
d418cf57a3 block-migration: remove useless calls to blk_mig_cleanup
Now that the cancel callback is called consistently for all errors,
we can avoid doing its work in the other callbacks.

Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2013-03-11 13:32:00 +01:00
Stefan Hajnoczi
6aaa9dae80 block-migration: fix pending() and iterate() return values
The return value of .save_live_pending() is the number of bytes
remaining.  This is just an estimate because we do not know how many
blocks will be dirtied by the running guest.

Currently our return value for .save_live_pending() is wrong because it
includes dirty blocks but not in-flight bdrv_aio_readv() requests or
unsent blocks.  Crucially, it also doesn't include the bulk phase where
the entire device is transferred - therefore we risk completing block
migration before all blocks have been transferred!

The return value of .save_live_iterate() is the number of bytes
transferred this iteration.  Currently we return whether there are bytes
remaining, which is incorrect.

Move the bytes remaining calculation into .save_live_pending() and
really return the number of bytes transferred this iteration in
.save_live_iterate().

Also fix the %ld format specifier which was used for a uint64_t
argument.  PRIu64 must be use to avoid warnings on 32-bit hosts.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-id: 1360661835-28663-3-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-12 16:26:44 -06:00
Stefan Hajnoczi
2c5a7f2011 block-migration: fix block_save_iterate() return value
The .save_live_iterate() function returns 0 to continue iterating or 1
to stop iterating.

Since 16310a3cca it only ever returns 0,
leading to an infinite loop.

Return 1 if we have finished sending dirty blocks.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1360534366-26723-4-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:05 -06:00
Stefan Hajnoczi
9ee0cb201e block-migration: fix blk_mig_save_dirty_block() return value checking
Commit 43be3a25c9 changed the
blk_mig_save_dirty_block() return code handling.  The function's doc
comment says:

  /* return value:
   * 0: too much data for max_downtime
   * 1: few enough data for max_downtime
   */

Because of the 1 return value, callers must check for ret < 0 instead of
just:

  if (ret) { ... }

We do not want to bail when 1 is returned, only on error.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360534366-26723-3-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:04 -06:00
Stefan Hajnoczi
d5f1f286ef block-migration: improve "Unknown flags" error message
Show the actual flags value and include "block migration" in the error
message so it's clear where the error is coming from.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360534366-26723-2-git-send-email-stefanha@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2013-02-11 08:14:04 -06:00
Paolo Bonzini
50717e941b block: allow customizing the granularity of the dirty bitmap
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:34 +01:00
Paolo Bonzini
acc906c6c5 block: return count of dirty sectors, not chunks
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-01-25 18:18:33 +01:00
Juan Quintela
e4ed1541ac savevm: New save live migration method: pending
Code just now does (simplified for clarity)

    if (qemu_savevm_state_iterate(s->file) == 1) {
       vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
       qemu_savevm_state_complete(s->file);
    }

Problem here is that qemu_savevm_state_iterate() returns 1 when it
knows that remaining memory to sent takes less than max downtime.

But this means that we could end spending 2x max_downtime, one
downtime in qemu_savevm_iterate, and the other in
qemu_savevm_state_complete.

Changed code to:

    pending_size = qemu_savevm_state_pending(s->file, max_size);
    DPRINTF("pending size %lu max %lu\n", pending_size, max_size);
    if (pending_size >= max_size) {
        ret = qemu_savevm_state_iterate(s->file);
     } else {
        vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
        qemu_savevm_state_complete(s->file);
     }

So what we do is: at current network speed, we calculate the maximum
number of bytes we can sent: max_size.

Then we ask every save_live section how much they have pending.  If
they are less than max_size, we move to complete phase, otherwise we
do an iterate one.

This makes things much simpler, because now individual sections don't
have to caluclate the bandwidth (it was implossible to do right from
there).

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-20 23:09:25 +01:00
Paolo Bonzini
9c17d615a6 softmmu: move include files to include/sysemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:45 +01:00
Paolo Bonzini
1de7afc984 misc: move include files to include/qemu/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:32:39 +01:00
Paolo Bonzini
caf71f86a3 migration: move include files to include/migration/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:32 +01:00
Paolo Bonzini
737e150e89 block: move include files to include/block/
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2012-12-19 08:31:31 +01:00
Juan Quintela
43be3a25c9 block-migration: handle errors with the return codes correctly
Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Juan Quintela
ceb2bd09a1 block-migration: Switch meaning of return value
Make consistent the result of blk_mig_save_dirty_block() and
mig_save_device_dirty()

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Juan Quintela
59feec4247 block-migration: make flush_blks() return errors
This means we don't need to pass through qemu_file to get the errors.
Adjust all callers.

Signed-off-by: Juan Quintela <quintela@redhat.com>

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2012-10-17 18:34:59 +02:00
Kevin Wolf
946d58be15 block-migration: Flush requests in blk_mig_cleanup
When cancelling block migration, all in-flight requests of the block
migration must be completed before the data can be freed. This was
visible as failing assertions and segfaults.

Reported-by: Peter Lieven <pl@dlhnet.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2012-09-28 17:43:28 +02:00
Juan Quintela
16310a3cca savevm: split save_live into stage2 and stage3
We split it into 2 functions, foo_live_iterate, and foo_live_complete.
At this point, we only remove the bits that are for the other stage,
functionally this is equivalent to previous code.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela
d1315aac6e savevm: split save_live_setup from save_live_state
This patch splits stage 1 to its own function for both save_live
users, ram and block.  It is just a copy of the function, removing the
parts of the other stages.  Optimizations would came later.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela
6bd6878133 savevm: introduce is_active method
Enable the creation of a method to tell migration if that section is
active and should be migrate.  We use it for blk-migration, that is
normally not active.  We don't create the method for RAM, as setups
without RAM are very strange O:-)

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela
9b5bfab05f savevm: Refactor cancel operation in its own operation
Intead of abusing stage with value -1.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Juan Quintela
7908c78d3e savevm: Live migration handlers register the struct directly
Notice that the live migration users never unregister, so no problem
about freeing the ops structure.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2012-07-20 08:19:27 +02:00
Isaku Yamahata
6607ae235b Add MigrationParams structure
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
2012-06-29 13:18:21 +02:00
Luiz Capitulino
539de1246d Purge migration of (almost) everything to do with monitors
The Monitor object is passed back and forth within the migration/savevm
code so that it can print errors and progress to the user.

However, that approach assumes a HMP monitor, being completely invalid
in QMP.

This commit drops almost every single usage of the Monitor object, all
monitor_printf() calls have been converted into DPRINTF() ones.

There are a few remaining Monitor objects, those are going to be dropped
by the next commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2012-03-15 10:39:52 -03:00
Paolo Bonzini
6b620ca3b0 prepare for future GPLv2+ relicensing
All files under GPLv2 will get GPLv2+ changes starting tomorrow.
event_notifier.c and exec-obsolete.h were only ever touched by Red Hat
employees and can be relicensed now.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2012-01-13 10:55:56 -06:00
Paolo Bonzini
ad54ae80c7 block: bdrv_aio_* do not return NULL
Initially done with the following semantic patch:

@ rule1 @
expression E;
statement S;
@@
  E =
(
   bdrv_aio_readv
|  bdrv_aio_writev
|  bdrv_aio_flush
|  bdrv_aio_discard
|  bdrv_aio_ioctl
)
     (...);
(
- if (E == NULL) { ... }
|
- if (E)
    { <... S ...> }
)

which however missed the occurrence in block/blkverify.c
(as it should have done), and left behind some unused
variables.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-15 12:40:07 +01:00
Stefan Hajnoczi
922453bca6 block: convert qemu_aio_flush() calls to bdrv_drain_all()
Many places in QEMU call qemu_aio_flush() to complete all pending
asynchronous I/O.  Most of these places actually want to drain all block
requests but there is no block layer API to do so.

This patch introduces the bdrv_drain_all() API to wait for requests
across all BlockDriverStates to complete.  As a bonus we perform checks
after qemu_aio_wait() to ensure that requests really have finished.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-12-05 14:56:06 +01:00
Stefan Weil
4238e26416 Fix some spelling bugs in documentation and comments
These errors were detected by codespell:

remaing -> remaining
soley -> solely
virutal -> virtual
seperate -> separate

libcacard.txt still needs some more patches.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-11-17 12:57:36 +00:00
Juan Quintela
2975725f6b migration: make *save_live return errors
Make *save_live() return negative values when there is one error, and
updates all callers to check for the error.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2011-10-20 13:23:52 +02:00
Juan Quintela
42802d47dd migration: use qemu_file_get_error() return value when possible
Signed-off-by: Juan Quintela <quintela@redhat.com>
2011-10-20 13:23:52 +02:00
Juan Quintela
624b9cc209 migration: rename qemu_file_has_error to qemu_file_get_error
Now the function returned errno, so it is better the new name.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
2011-10-20 13:23:52 +02:00
Juan Quintela
dcd1d224df migration: change has_error to contain errno values
We normally already have an errno value.  When not, abuse EIO.

Signed-off-by: Juan Quintela <quintela@redhat.com>
2011-10-20 13:23:52 +02:00
Anthony Liguori
7267c0947d Use glib memory allocation and free functions
qemu_malloc/qemu_free no longer exist after this commit.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-08-20 23:01:08 -05:00
Markus Armbruster
6daf194dde Strip trailing '\n' from error_report()'s first argument
error_report() prepends location, and appends a newline.  The message
constructed from the arguments should not contain a newline.  Fix the
obvious offenders.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
2011-06-24 09:13:36 +01:00
Avishay Traeger
ff5c52a379 Improve accuracy of block migration bandwidth calculation
block_mig_state.total_time is currently the sum of the read request
latencies.  This is not very accurate because block migration uses aio and
so several requests can be submitted at once.  Bandwidth should be computed
with wall-clock time, not by adding the latencies.  In this case,
"total_time" has a higher value than it should, and so the computed
bandwidth is lower than it is in reality.  This means that migration can
take longer than it needs to.
However, we don't want to use pure wall-clock time here.  We are computing
bandwidth in the asynchronous phase, where the migration repeatedly wakes
up and sends some aio requests.  The computed bandwidth will be used for
synchronous transfer.

Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-27 14:36:57 +02:00
Avishay Traeger
155eb9aa09 Fix integer overflow in block migration bandwidth calculation
block_mig_state.reads is an int, and multiplying by BLOCK_SIZE yielded a
negative number, resulting in a negative bandwidth (running on a 32-bit
machine).  Change order to avoid.

Signed-off-by: Avishay Traeger <avishay@il.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-04-07 13:51:48 +02:00
Marcelo Tosatti
8591675f44 block: enable in_use flag
Set block device in use during block migration, disallow drive_del and
bdrv_truncate for in use devices.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2011-02-07 12:51:19 +01:00